(57) Abstract: Compositions and methods are provided to identify functional mutant ribosomes that may be used as drug targets.
The compositions and methods allow isolation and analysis of mutations that would normally be lethal and allow direct selection
of rRNA mutants with predetermined levels of ribosome function. The compositions and methods of the present invention may be
used to identify antibiotics to treat a large number of human pathogens through the use of genetically engineered rRNA genes from
a variety of species. The invention further provides novel plasmid constructs to be used in the methods of the invention.

METHODS AND COMPOSITIONS FOR THE
IDENTIFICATION OF ANTIBIOTICS THAT ARE NOT SUSCEPTIBLE TO
ANTIBIOTIC RESISTANCE

Related Applications

The present application claims priority from U.S. provisional patent application
serial no. 60/393,237, filed on July 1, 2002, and U.S. provisional patent application
serial no. 60/452,012, filed on March 5, 2003, both of which are expressly incorporated by
reference.

Background of the Invention

Ribosomes are composed of one large and one small subunit containing three or
four RNA molecules and over fifty proteins. The part of the ribosome that is directly
involved in protein synthesis is the ribosomal RNA (rRNA). The ribosomal proteins are
responsible for folding the rRNAs into their correct three-dimensional structures.
Ribosomes and the protein synthesis process are very similar in all organisms. One
difference between bacteria and other organisms, however, is the way that ribosomes
recognize mRNA molecules that are ready to be translated. In bacteria, this process
involves a base-pairing interaction between several nucleotides near the beginning of the
mRNA and an equal number of nucleotides at the end of the ribosomal RNA molecule in
the small subunit. The mRNA sequence is known as the Shine-Dalgarno (SD) sequence
and its counterpart on the rRNA is called the Anti-Shine-Dalgarno (ASD) sequence.

There is now extensive biochemical, genetic and phylogenetic evidence
indicating that rRNA is directly involved in virtually every aspect of ribosome function
(Garrett, R. A., et al. (2000) The Ribosome: Structure, Function, Antibiotics, and
Cellular Interactions. ASM Press, Washington, DC). Genetic and functional analyses of
rRNA mutations in E. coli and most other organisms have been complicated by the
presence of multiple rRNA genes and by the occurrence of dominant lethal rRNA
mutations. Because there are seven rRNA operons in E. coli, the phenotypic expression
of rRNA mutations may be affected by the relative amounts of mutant and wild-type
ribosomes in the cell. Thus, detection of mutant phenotypes can be hindered by the
presence of wild-type ribosomes. A variety of approaches have been designed to
circumvent these problems.

One common approach uses cloned copies of a wild-type rRNA operon (Brosius,
J., et al. (1981) Plasmid 6: 112-118; Sigmund, C. D. et al. (1982) Proc. Natl. Acad. Sci.
U.S.A. 79: 5602-5606). Several groups have used this system to detect phenotypic
differences caused by a high level of expression of mutant ribosomes. Recently, a strain
of E. coli was constructed in which the only supply of ribosomal RNA was plasmid
encoded (Asai, T., (1999) J. Bacteriol. 181: 3803-3809). This system has been used to
study transcriptional regulation of rRNA synthesis, as well as ribosomal RNA function
(Voulgaris, J., et al. (1999) J. Bacteriol. 181: 4170-4175; Koosha, H., et al. (2000) RNA
6: 1166-1173; Sergiev, P. V., et al. (2000) J. Mol. Biol. 299: 379-389; O'Connor, M. et
al. (2001) Nucl. Acids Res. 29: 1420-1425; O'Connor, M., et al. (2001) Nucl. Acids Res.
29: 710-715; Vila-Sanjurjo, A. et al. (2001) J. Mol. Biol. 308: 457-463; Morosyuk, S.
V., et al. (2000) J. Mol. Biol. 300 (1): 113-126; Morosyuk, S. V., et al. (2001) J. Mol.
Biol. 307 (1): 197-210; and Morosyuk, S. V., et al. (2001) J. Mol. Biol. 307 (1): 211-228).
Hui et al. showed that mRNA could be directed to a specific subset of plasmid-encoded
ribosomes by altering the message binding site (MBS) of the ribosome while at the same
time altering the ribosome binding site (RBS) of an mRNA (Hui, A., et al. (1987)
Methods Enzymol. 153: 432-452).

Although each of the above methods has contributed significantly to the
understanding of rRNA function, progress in this field has been hampered both by the
complexity of translation and by difficulty in applying standard genetic selection
techniques to these systems.

Resistance to antibiotics, a matter of growing concern, is caused partly by
antibiotic overuse. According to a study published by the Journal of the American
Medical Association in 2001, between 1989 and 1999 American adults made some 6.7
million visits a year to the doctor for sore throat. In 73% of those visits, the study found,
the patient was treated with antibiotics, though only 5%-17% of sore throats are caused
by bacterial infections, the only kind that respond to antibiotics. Macrolide antibiotics in
particular are becoming extremely popular for treatment of upper respiratory infections,
in part because of their typically short, convenient course of treatment. Research has
linked such vast use to a rise in resistant bacteria, and the recent development of multiple
drug resistance has underscored the need for antibiotics which are highly specific and
refractory to the development of drug resistance.

Microorganisms can be resistant to antibiotics by four mechanisms. First,
resistance can occur by reducing the amount of antibiotic that accumulates in the cell.
Cells can accomplish this by either reducing the uptake of the antibiotic into the cell or
by pumping the antibiotic out of the cell. Uptake mediated resistance often occurs
because a particular organism does not have the antibiotic transport protein on the cell
surface or occasionally when the constituents of the membrane are mutated in a way that
interferes with transport of the antibiotic into a cell. Uptake mediated resistance is only
possible in instances where the drug gains entry through a nonessential transport
molecule. Efflux mechanisms of antibiotic resistance occur via transporter proteins.
These can be highly specific transporters that transport a particular antibiotic, such as
tetracycline, out of the cell or they can be more general transporters that transport groups
of molecules with similar characteristics out of the cell. The most notorious example of
a nonspecific transporter is the multidrug resistance transporter (MDR).

Inactivating the antibiotic is another mechanism by which microorganisms can
become resistant to antibiotics. Antibiotic inactivation is accomplished when an enzyme
in the cell chemically alters the antibiotic so that it no longer binds to its intended target.
These enzymes are usually very specific and have evolved over millions of years, along
with the antibiotics that they inactivate. Examples of antibiotics that are enzymatically
inactivated are penicillin, chloramphenicol, and kanamycin.

Resistance can also occur by modifying or overproducing the target site. The
target molecule of the antibiotic is either mutated or chemically modified so that it no
longer binds the antibiotic. This is possible only if modification of the target does not
interfere with normal cellular functions. Target site overproduction is less common but
can also produce cells that are resistant to antibiotics.

Lastly, target bypass is a mechanism by which microorganisms can become
resistant to antibiotics. In bypass mechanisms, two metabolic pathways or targets exist
in the cell and one is not sensitive to the antibiotic. Treatment with the antibiotic selects
cells with more reliance on the second, antibiotic-resistant pathway.

Among these mechanisms, the greatest concern for new antibiotic development is
target site modification. Enzymatic inactivation and specific transport mechanisms
require the existence of a substrate specific enzyme to inactivate or transport the
antibiotic out of the cell. Enzymes have evolved over millions of years in response to
naturally occurring antibiotics. Since microorganisms cannot spontaneously generate
new enzymes, these mechanisms are unlikely to pose a significant threat to the
development of new synthetic antibiotics. Target bypass only occurs in cells where
redundant metabolic pathways exist. As understanding of the MDR transporters
increases, it is increasingly possible to develop drugs that are not transported out of the
cell by them. Thus, target site modification poses the greatest risk for the development
of antibiotic resistance for new classes of antibiotic, and this is particularly true for those
antibiotics that target ribosomes. The only new class of antibiotics in thirty-five years,
the oxazolidinones, is a recent example of an antibiotic that has been compromised
because of target site modification. Resistant strains containing a single mutation in
rRNA developed within seven months of its use in clinical settings.

Summary of the Invention 

The present invention provides compositions and methods which may be used to
identify antibiotics that are not susceptible to the development of antibiotic resistance.
In particular, rRNA genes from E. coli and other disease causing organisms are
genetically engineered to allow identification of functional mutant ribosomes that may
be used as drug targets, e.g., to screen chemical and peptide libraries to identify
compounds that bind to all functional mutant ribosomes but do not bind to human
ribosomes. Antibiotics that recognize all biologically active forms of the target molecule
and are therefore not susceptible to the development of drug resistance by target site
modification are thus identified.

The invention provides plasmid constructs comprising an rRNA gene having a
mutant ASD sequence set forth in Figures 12, 13, 15, and 16, at least one mutation in the
rRNA gene, and a genetically engineered gene which encodes a selectable marker having
a mutant SD sequence set forth in Figures 12, 13, 15, and 16. The mutant SD-ASD
sequences are mutually compatible pairs and therefore permit translation of only the
mRNA containing the compatible mutant SD sequence, i.e., translation of the selectable
marker. In one embodiment, the selectable marker is chosen from the group consisting
of chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), or both
CAT and GFP. In another embodiment, the DNA sequence encoding the rRNA gene is
under the control of an inducible promoter.

The rRNA gene may be selected from a variety of species, thereby providing for
the identification of functional mutant ribosomes that may be used as drug targets to
identify drug candidates that are effective against the selected species. Examples of
species include, without limitation, Mycobacterium tuberculosis (tuberculosis),
Pseudomonas aeruginosa (multidrug resistant nosocomial infections), Salmonella typhi
(typhoid fever), Yersinia pestis (plague), Staphylococcus aureus (multidrug resistant
infections causing impetigo, folliculitis, abscesses, boils, infected lacerations,
endocarditis, meningitis, septic arthritis, pneumonia, osteomyelitis, and toxic shock),
Streptococcus pyogenes (streptococcal sore throat, scarlet fever, impetigo, erysipelas,
puerperal fever, and necrotizing fasciitis), Enterococcus faecalis (vancomycin resistant
nosocomial infections, endocarditis, and bacteremia), Chlamydia trachomatis
(lymphogranuloma venereum, trachoma and inclusion conjunctivitis, nongonococcal
urethritis, epididymitis, cervicitis, urethritis, infant pneumonia, pelvic inflammatory
diseases, Reiter's syndrome (oligoarthritis) and neonatal conjunctivitis), Saccharomyces
cerevisiae, Candida albicans, and trypanosomes. In one embodiment, the rRNA gene is
from Mycobacterium tuberculosis (see, e.g., Example 6 and Figure 17).

In still other embodiments of the invention, the rRNA genes are mitochondrial
rRNA genes, i.e., eukaryotic rRNA genes (e.g., human mitochondrial rRNA genes).

The plasmid constructs of the invention, such as the plasmid constructs set forth
in Figures 22-26, may include novel mutant ASD and SD sequences set forth herein. In
particular, the present invention provides novel mutant ASD sequences and novel mutant
SD sequences, set forth in Figures 12, 13, 15, and 16, which may be used in the plasmid
constructs and methods of the invention. The mutant ASD and mutant SD sequences
may be used as mutually compatible pairs (see Figures 12, 13, 15, and 16). It will be
appreciated that the mutually compatible pairs of mutant ASD and SD sequences interact
as pairs in the form of RNA and permit translation of only the mRNAs containing the
compatible mutant SD sequence.
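
As an illustration of what a mutually compatible pair can look like at the sequence level, the following Python sketch checks whether an SD (RBS) sequence and an ASD (MBS) sequence are perfect antiparallel Watson-Crick complements. The three pairs are the RBS/MBS sequences listed in the description of Figure 4; the function names are hypothetical, and the strict-complementarity rule is a simplifying assumption (it ignores G-U wobble pairs and does not claim to reproduce how the pairs in Figures 12, 13, 15, and 16 were selected).

```python
# Minimal sketch: test SD/ASD pairs for perfect antiparallel Watson-Crick pairing.
# Function names are illustrative; strict complementarity is an assumption.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_compatible_pair(sd: str, asd: str) -> bool:
    """True if the SD (mRNA) and ASD (rRNA) sequences pair base for base."""
    return reverse_complement(sd) == asd.upper()


# RBS (SD) and MBS (ASD) sequences as listed for Figure 4.
PAIRS = {
    "pRNA6": ("GUGUG", "CACAC"),
    "pRNA9 (wild type)": ("GGAGG", "CCUCC"),
    "AUCCC/GGGAU clone": ("AUCCC", "GGGAU"),
}

if __name__ == "__main__":
    for name, (sd, asd) in PAIRS.items():
        print(name, sd, asd, is_compatible_pair(sd, asd))
    # A mutant SD paired with the wild-type ASD is not complementary:
    print("AUCCC vs CCUCC:", is_compatible_pair("AUCCC", "CCUCC"))
```

Under this simplified rule, an mRNA carrying the mutant SD would not pair with the wild-type ASD, which is consistent with the specificity described above.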

In another aspect, the present invention provides a plasmid comprising an E. coli
16S rRNA gene having a mutant ASD sequence, at least one mutation in said 16S rRNA
gene, and a genetically engineered gene which encodes a selectable marker, e.g., GFP,
having a mutant SD sequence. In another embodiment, the 16S rRNA gene is from a
species other than E. coli. In one embodiment, the mutant ASD sequence is selected
from the sequences set forth in Figures 12, 13, 15, and 16. In another embodiment, the
mutant SD sequence is selected from the sequences set forth in Figures 12, 13, 15, and
16. In yet another embodiment, the mutant ASD sequence and the mutant SD sequence
are in mutually compatible pairs (see Figures 12, 13, 15, and 16). Each mutually
compatible mutant SD and mutant ASD pair permits translation of the selectable
marker.

In one embodiment, the invention features a cell comprising a plasmid of the
invention. In another embodiment, the cell is a bacterial cell.

In one embodiment, the invention provides a method for identifying functional
mutant ribosomes comprising:

(a) transforming a host cell with a plasmid comprising an rRNA gene having a
mutant ASD sequence, at least one mutation in said rRNA gene, and a genetically
engineered gene which encodes a selectable marker having a mutant SD sequence,
wherein the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating cells via the selectable marker; and

(c) identifying the rRNA from the cells from step (b), thereby identifying
functional mutant ribosomes.

In another embodiment, the invention features a method for identifying
functional mutant ribosomes comprising:

(a) transforming a host cell with a plasmid comprising an E. coli 16S rRNA gene
having a mutant ASD sequence, at least one mutation in said 16S rRNA gene, and a
genetically engineered gene which encodes GFP having a mutant SD sequence, wherein
the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating cells via the GFP; and

(c) identifying the rRNA from the cells from step (b), thereby identifying
functional mutant ribosomes.

In yet another embodiment, the invention features a method for identifying
functional mutant ribosomes that may be suitable as drug targets comprising:

(a) transforming a host cell with a plasmid comprising an rRNA gene having a
mutant ASD sequence, at least one mutation in said rRNA gene, and a genetically
engineered gene which encodes a selectable marker having a mutant SD sequence,
wherein the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating cells via the selectable marker;

(c) identifying and sequencing the rRNA from the cells from step (b), thereby
identifying regions of interest;

(d) selecting regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an rRNA gene having a mutant ASD sequence and a genetically engineered
gene which encodes a selectable marker having a mutant SD sequence, wherein the
mutant ASD and mutant SD sequences are a mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells of step (g) via the selectable marker; and

(i) identifying the rRNA from step (h), thereby identifying functional mutant
ribosomes that may be suitable as drug targets.

In a further embodiment, the invention provides a method for identifying
functional mutant ribosomes that may be suitable as drug targets comprising:

(a) transforming a host cell with a plasmid comprising an E. coli 16S rRNA gene
having a mutant ASD sequence, at least one mutation in said 16S rRNA gene, and a
genetically engineered gene which encodes GFP having a mutant SD sequence, wherein
the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating cells via the GFP;

(c) identifying and sequencing the rRNA from the cells from step (b), thereby
identifying regions of interest;

(d) selecting the regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an E. coli 16S rRNA gene having a mutant ASD sequence and a genetically
engineered gene which encodes GFP having a mutant SD sequence, wherein the mutant
ASD and mutant SD sequences are a mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells of step (g) via the GFP; and

(i) identifying the rRNA from step (h), thereby identifying functional mutant
ribosomes that may be suitable as drug targets.

In one embodiment, the invention features a method for identifying drug
candidates comprising:

(a) transforming a host cell with a plasmid comprising an rRNA gene having a
mutant ASD sequence, at least one point mutation in said rRNA gene, and a genetically
engineered gene which encodes a selectable marker having a mutant SD sequence,
wherein the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating cells via the selectable marker;

(c) identifying and sequencing the rRNA from step (b) to identify the regions of
interest;

(d) selecting the regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an rRNA gene having a mutant ASD sequence and a genetically engineered
gene which encodes a selectable marker having a mutant SD sequence, wherein the
mutant ASD and mutant SD sequences are a mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating the cells from step (g) via the selectable marker;

(i) identifying the rRNA from step (h) to identify the functional mutant
ribosomes;

(j) screening drug candidates against functional mutant ribosomes from step (i);

(k) identifying the drug candidates from step (j) that bound to the functional
mutant ribosomes from step (i);

(l) screening the drug candidates from step (k) against human rRNA; and

(m) identifying the drug candidates from step (l) that do not bind to human
rRNA, thereby identifying drug candidates.
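
The final screening steps of the method above reduce to a simple filter: keep compounds that bind every functional mutant ribosome and do not bind the human target. The Python sketch below illustrates only that selection logic. The compound names, ribosome labels, binding data, and the `select_candidates` function are all hypothetical placeholders, not results or an assay protocol from this disclosure.

```python
# Minimal sketch of the candidate filter in steps (j) through (m).
# All names and binding data below are hypothetical placeholders.
def select_candidates(binding, mutant_ribosomes, human_target="human rRNA"):
    """Return compounds that bind all mutant ribosomes but not the human target.

    binding: dict mapping a compound name to the set of targets it binds.
    mutant_ribosomes: iterable of functional mutant ribosome labels.
    """
    required = set(mutant_ribosomes)
    selected = []
    for compound, bound in sorted(binding.items()):
        binds_all_mutants = required <= bound      # steps (j) and (k)
        spares_human = human_target not in bound   # steps (l) and (m)
        if binds_all_mutants and spares_human:
            selected.append(compound)
    return selected


if __name__ == "__main__":
    mutants = ["mutant_1", "mutant_2"]
    binding = {
        "compound_A": {"mutant_1", "mutant_2"},
        "compound_B": {"mutant_1"},
        "compound_C": {"mutant_1", "mutant_2", "human rRNA"},
    }
    print(select_candidates(binding, mutants))
```

Requiring binding to all functional mutants, rather than any one of them, reflects the stated goal of finding compounds that target site modification cannot evade.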

In one embodiment, the invention provides a method for identifying drug
candidates comprising:

(a) transforming a host cell with a plasmid comprising an E. coli 16S rRNA gene
having a mutant ASD sequence, at least one point mutation in said 16S rRNA gene, and
a genetically engineered gene which encodes GFP having a mutant SD sequence,
wherein the mutant ASD and mutant SD sequences are a mutually compatible pair;

(b) isolating the cells via the selectable marker;

(c) identifying and sequencing the rRNA from step (b) to identify the regions of
interest;

(d) selecting the regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an E. coli 16S rRNA gene having a mutant ASD sequence and a genetically
engineered gene which encodes GFP having a mutant SD sequence, wherein the mutant
ASD and mutant SD sequences are a mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells from step (g) via the selectable marker;

(i) identifying the rRNA from step (h) to identify the functional mutant
ribosomes;

(j) screening drug candidates against the functional mutant ribosomes from step
(i);

(k) identifying the drug candidates from step (j) that bound to the functional
mutant ribosomes from step (i);

(l) screening the drug candidates from step (k) against human 16S rRNA; and

(m) identifying the drug candidates from step (l) that do not bind to the human
16S rRNA, thereby identifying drug candidates.

It will be appreciated that the rRNA gene used in the methods of the present
invention may be from the 16S rRNA, 23S rRNA, and 55S rRNA gene.

Other features and advantages of the invention will be apparent from the
following detailed description and claims.



Brief Description of the Drawings 

Figure 1 depicts the plasmid construct pRNA123. The locations of specific sites
in pRNA123 are as follows: the 16S rRNA of E. coli rrnB operon corresponds to nucleic
acids 1-1542; the 16S MBS (message binding sequence) GGGAU corresponds to nucleic
acids 1536-1540; the 16S-23S spacer region corresponds to nucleic acids 1543-1982; the
23S rRNA of E. coli rrnB operon corresponds to nucleic acids 1983-4886; the 23S-5S
spacer region corresponds to nucleic acids 4887-4982; the 5S rRNA of E. coli rrnB
operon corresponds to nucleic acids 4983-5098; the terminator T1 of E. coli rrnB operon
corresponds to nucleic acids 5102-5145; the terminator T2 of E. coli rrnB operon
corresponds to nucleic acids 5276-5305; the bla (β-lactamase; ampicillin resistance)
corresponds to nucleic acids 6575-7432; the replication origin corresponds to nucleic
acids 7575-8209; the rop (Rop protein) corresponds to nucleic acids 8813-8622; the GFP
corresponds to nucleic acids 10201-9467; the GFP RBS (ribosome binding sequence)
AUCCC corresponds to nucleic acids 10213-10209; the trp^c promoter corresponds to
nucleic acids 10270-10230; the trp^c promoter corresponds to nucleic acids 10745-10785;
the CAT RBS AUCCC corresponds to nucleic acids 10802-10806; the cam
(chloramphenicol acetyltransferase; CAT) corresponds to nucleic acids 10814-11473;
the lacI^q promoter corresponds to nucleic acids 11782-11859; the lacI^q (lac repressor)
corresponds to nucleic acids 11860-12942; and the lacUV5 promoter corresponds to
nucleic acids 12985-13026.
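
A coordinate list of this kind is easier to check when held as structured data. The Python sketch below records a few of the pRNA123 features listed for Figure 1 and derives each feature's length and orientation. The `Feature` type is hypothetical, and reading a start coordinate larger than the end coordinate as reverse orientation is an assumption made for illustration, not something stated in the figure description.

```python
# Minimal sketch: a subset of the Figure 1 feature coordinates as structured data.
# The Feature type is illustrative; start > end is ASSUMED to mean reverse orientation.
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        """Number of nucleotides spanned, inclusive of both endpoints."""
        return abs(self.end - self.start) + 1

    @property
    def strand(self) -> str:
        """'+' when coordinates ascend, '-' when they descend (assumption)."""
        return "+" if self.start <= self.end else "-"


FEATURES = [
    Feature("16S rRNA", 1, 1542),
    Feature("23S rRNA", 1983, 4886),
    Feature("5S rRNA", 4983, 5098),
    Feature("bla", 6575, 7432),
    Feature("GFP", 10201, 9467),
    Feature("cam (CAT)", 10814, 11473),
]

if __name__ == "__main__":
    for f in FEATURES:
        print(f"{f.name:10s} {f.start:>6d}-{f.end:<6d} {f.length:>5d} nt  strand {f.strand}")
```

Computing lengths this way gives a quick sanity check; for example, the cam interval spans 660 nucleotides, a multiple of three as expected for a coding region.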
Figure 2 depicts a scheme for construction of pRNA9. The abbreviations in
Figure 2 are defined as follows: Ap^r, ampicillin resistance; cam, CAT gene; lacI^q,
lactose repressor; PlacUV5, lacUV5 promoter; Ptrp^c, constitutive trp promoter. The
restriction sites used are also indicated.

Figure 3 depicts an autoradiogram of sequencing gels with pRNA8-rMBS-rRBS.
The mutagenic MBS and RBS are shown: B = C, G, T; D = A, G, T; H = A, C, T; V = A,
C, G. The start codon of cam and the 3' end of 16S rRNA are indicated. Panel A
depicts the RBS of the CAT gene. Panel B depicts the MBS of the 16S rRNA gene.
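
The single-letter codes in the Figure 3 legend (B, D, H, V) are degenerate nucleotide codes, each standing for a set of three bases; N, used for Figure 5, stands for all four. The Python sketch below enumerates every concrete sequence encoded by a degenerate pattern, which shows how quickly a randomized site grows into a pool of variants. The function name and the example patterns are hypothetical and are not the actual mutagenic primers.

```python
# Minimal sketch: expand degenerate nucleotide codes into concrete sequences.
# Example patterns are hypothetical, not the mutagenic primers used in the figures.
from itertools import product

DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT",   # not A
    "D": "AGT",   # not C
    "H": "ACT",   # not G
    "V": "ACG",   # not T
    "N": "ACGT",  # any base
}


def expand(pattern: str) -> list[str]:
    """List every sequence matched by a pattern of degenerate codes."""
    choices = (DEGENERATE[code] for code in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(expand("VB"))          # 3 x 3 = 9 two-base variants
    print(len(expand("NNNN")))   # four fully randomized positions
```

Each of B, D, H, and V excludes exactly one base, so such a code can randomize a position while leaving out one chosen nucleotide.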

Figure 4 depicts a graph of the effect of MBSs on growth. The abbreviations in
Figure 4 are defined as follows: pBR322: vector; pRNA6: RBS = GUGUG, MBS =
CACAC; pRNA9: RBS = GGAGG (wt), MBS = CCUCC (wt); and Clone 1X24: RBS =
AUCCC, MBS = GGGAU.

Figure 5 depicts a scheme for construction of pRNA122. The abbreviations in
Figure 5 are defined as follows: Ap^r, ampicillin resistance; cam, CAT gene; lacI^q,
lactose repressor; PlacUV5, lacUV5 promoter; Ptrp^c, constitutive trp promoter; N = A,
C, G, and T. The four nucleotides mutated are underlined and the restriction sites used
are indicated.

Figure 6 depicts a plasmid-derived ribosome distribution and CAT activity.
Cultures were induced (or not) in early log phase (as shown in Figure 4) and samples
were withdrawn for CAT assay and total RNA preparation at the points indicated. Open
squares represent the percent plasmid-derived rRNA in uninduced cells. Closed squares
represent the percent plasmid-derived rRNA in induced cells. Open circles represent
CAT activity in uninduced cells. Closed circles represent CAT activity in induced cells.

Figure 7 depicts a scheme for construction of single mutations at positions 516 or
535. The abbreviations in Figure 7 are defined as follows: Ap^r, ampicillin resistance;
cam, CAT gene; lacI^q, lactose repressor; PlacUV5, lacUV5 promoter; Ptrp^c, constitutive
trp promoter. C516 was substituted to V (A, C, or G) and A535 was substituted to B (C,
G, or T) in pRNA122 and the restriction sites that were used are also indicated.

Figure 8 depicts the functional analysis of mutations constructed at positions 516
and 535 of 16S rRNA in pRNA122. Nucleotide identities are indicated in the order of
516:535 and mutations are underlined. pRNA122 containing the wild-type MBS (wt
MBS) was used as a negative control to assess the degree of MIC and the level of CAT
activity due to CAT mRNA translation by wild-type ribosomes. Standard error of the
mean is used to indicate the range of the assay results.

Figure 9 depicts a description and use of oligodeoxynucleotides. Primer binding
sites are indicated by the number of nucleotides from the 5' nucleotide of the coding
region. Negative numbers indicate binding sites 5' to the coding region.

Figure 10 describes several plasmids used in Example 4.

Figure 11 depicts the specificity of the selected recombinants. The
concentrations of chloramphenicol used are indicated and the unit of MIC is micrograms
of chloramphenicol/mL.

Figure 12 depicts novel mutant ASD sequences and novel mutant SD sequences
of the present invention. Figure 12 also shows a sequence analysis of chloramphenicol
resistant isolates. The mutated nucleotides are underlined and potential duplex
formations are boxed. CAT activity was measured twice for each culture and the unit is
CPM/0.1 µL of culture/OD600. Induction was measured by dividing CAT activity in
induced cells by CAT activity in uninduced cells. A -1 indicates no induction, while a
+1 indicates induction with 1 mM IPTG.
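
The induction value described for Figure 12 is a ratio of CAT activity in induced cells to CAT activity in uninduced cells. The Python sketch below computes it from duplicate measurements. The function name and the numeric values are hypothetical, and averaging the two measurements before taking the ratio is an assumption, since the legend does not say how duplicates were combined.

```python
# Minimal sketch: fold induction from duplicate CAT activity measurements.
# Values are hypothetical; averaging duplicates before dividing is an ASSUMPTION.
from statistics import mean


def fold_induction(induced, uninduced):
    """Ratio of mean induced CAT activity to mean uninduced CAT activity."""
    baseline = mean(uninduced)
    if baseline <= 0:
        raise ValueError("uninduced CAT activity must be positive")
    return mean(induced) / baseline


if __name__ == "__main__":
    induced_cpm = [400, 440]     # hypothetical duplicate readings, induced culture
    uninduced_cpm = [40, 44]     # hypothetical duplicate readings, uninduced culture
    print(fold_induction(induced_cpm, uninduced_cpm))
```

Because both activities share the same unit, the ratio is dimensionless, so cultures with different absolute activities can be compared directly.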

Figure 13 depicts novel mutant ASD sequences and novel mutant SD sequences
of the present invention. Figure 13 also shows a sequence analysis of CAT mRNA
mutants. Potential duplex formations are boxed and the mutated nucleotides are
underlined. The start codon (AUG) is in bold. A -1 indicates no induction, while a +1
indicates induction with 1 mM IPTG.

Figure 14 depicts the effect of Pseudouridine516 substitutions on subunit
assembly. The percent plasmid-derived 30S data are presented as the percentage of the
total 30S in each peak and in crude ribosomes.

Figure 15 depicts novel mutant ASD sequences and novel mutant SD sequences
of the present invention.

Figure 16 depicts novel mutant ASD sequences and novel mutant SD sequences 
of the present invention. 

Figure 17 depicts a hybrid construct. This hybrid construct contains a 16S rRNA
from Mycobacterium tuberculosis. The specific sites on the hybrid construct are as
follows: the part of rRNA from E. coli rrnB operon corresponds to nucleic acids 1-931;
the part of 16S rRNA from Mycobacterium tuberculosis rrn operon corresponds to
nucleic acids 932-1542; the 16S MBS (message binding sequence) GGGAU corresponds
to nucleic acids 1536-1540; the terminator T1 of E. coli rrnB operon corresponds to
nucleic acids 1791-1834; the terminator T2 of E. coli rrnB operon corresponds to nucleic
acids 1965-1994; the replication origin corresponds to nucleic acids 3054-2438; the bla
(β-lactamase; ampicillin resistance) corresponds to nucleic acids 3214-4074; the GFP
corresponds to nucleic acids 5726-4992; the GFP RBS (ribosome binding sequence)
AUCCC corresponds to nucleic acids 5738-5734; the trp^c promoter corresponds to
nucleic acids 5795-5755; the trp^c promoter corresponds to nucleic acids 6270-6310; the
CAT RBS (ribosome binding sequence) AUCCC corresponds to nucleic acids 6327-6331;
the cam (chloramphenicol acetyltransferase; CAT) corresponds to nucleic acids
6339-6998; the lacI^q promoter corresponds to nucleic acids 7307-7384; the lacI^q (lac
repressor) corresponds to nucleic acids 7385-8467; and the lacUV5 promoter
corresponds to nucleic acids 8510-8551.

Figure 18 depicts a plasmid map of pRNA122.

Figure 19 depicts a table of sequences and MICs of functional mutants.
Sequences are ranked by the minimum inhibitory concentration ("MIC") of
chloramphenicol required to fully inhibit growth of cells expressing the mutant
ribosomes. The nucleotide sequences ("Nucleotide sequence") are the 790 loop
sequences selected from the pool of functional, randomized mutants. Mutations are
underlined. The number of mutations ("Number of mutations") in each mutant sequence
is indicated, as well as the number of occurrences ("Number of occurrences"), which
represents the number of clones with the indicated sequence. The sequence and activity
of the unmutated control, pRNA122 (WT, wild-type), are depicted in the first row of
Figure 19, in which the MIC is 600 µg/ml.

Figure 20 depicts the 790-loop sequence variation. In the consensus sequence, R
= A or G; N = A, C, G or U; M = A or C; H = A, C or U; W = A or U; Y = C or U; Δ =
deletion; and underlined numbers indicate the wild-type E. coli sequence.

Figure 21 depicts functional and thermodynamic analysis of positions 787 and
795. Mutations have been underlined and "n.d." represents not determined. Figure 21
shows site-directed mutations ("Nucleotide") that were constructed using PCR, as
described for the random mutants, except that the mutagenic primers contained
substitutions corresponding only to positions 787 and 795. In order to determine
ribosome function ("Mean CAT activity"), each strain was grown and assayed for CAT
activity at least twice, the data were averaged, and presented as percentages of the
unmutated control, pRNA122 ± the standard error of the mean. The ratio of plasmid to
chromosome-derived rRNA in 30S and 70S ribosomes ("% Mutant 30S in 30S peak/
70S peak") was determined by primer extension. Cultures were grown and assayed at
least twice and the mean values are presented as a percentage of the total 30S in each
peak ± the standard error of the mean. Thermodynamic parameters ("Thermodynamics")
are for the higher-temperature transition of model oligonucleotides and are the average
of results for four or five different oligomer concentrations. Standard errors for the
ΔG°37 are ± 5% (1 kcal = 4184 J). Errors in Tm are estimated as ± 1 °C. All solutions
were at pH 7.

Figure 22 depicts the DNA sequence of pRNA8. 
Figure 23 depicts the DNA sequence of pRNA122. 
35 Figure 24 depicts the DNA sequence of pRNA123. 

Figure 25 depicts the DNA sequence of pRNA123 Mycobacterium tuberculosis-2 
(pRNA123 containing a hybrid of E. coli and Mycobacterium tuberculosis 16S rRNA 
genes). 

Figure 26 depicts the DNA sequence of pRep-Mycobacterium tuberculosis-2 
(containing a pUC19 derivative containing the rRNA operon from pRNA122; however, 
the 23S and 5S rRNA genes are deleted). 

Figures 2-14 may be found in Lee, K., et al. Genetic Approaches to Studying 
Protein Synthesis: Effects of Mutations at Pseudouridine 516 and A535 in Escherichia 
coli 16S rRNA. Symposium: Translational Control: A Mechanistic Perspective at the 
Experimental Biology 2001 Meeting (2001); and Figures 18-21 may be found in Lee, K. 
et al., J. Mol. Biol. 269: 732-743 (1997), all of which are expressly incorporated by 
reference herein. 

Detailed Description of the Invention 

Compositions and methods are provided to identify functional mutant ribosomes 
suitable as drug targets. The compositions and methods allow isolation and analysis of 
mutations that would normally be lethal and allow direct selection of rRNA mutants with 
predetermined levels of ribosome function. The compositions and methods of the 
present invention may be used to identify antibiotics to treat human pathogens 
generally and/or selectively. 

According to one embodiment of the invention, a functional genomics database 
for rRNA genes of a variety of species may be generated. In particular, the rRNA gene 
is randomly mutated using a generalized mutational strategy. A host cell is then 
transformed with a mutagenized plasmid of the invention comprising: an rRNA gene 
having a mutant ASD sequence, the mutated rRNA gene, and a genetically engineered 
gene which encodes a selectable marker having a mutant SD sequence. The selectable 
marker gene, such as CAT, may be used to select mutants that are functional, e.g., by 
plating the transformed cells onto growth medium containing chloramphenicol. The 
mutant rRNA genes contained in each plasmid DNA of the individual clones from each 
colony are selected and characterized. The function of each of the mutant rRNA genes is 
assessed by measuring the amount of the product of an additional selectable marker 
gene, such as GFP, produced by each clone upon induction of the rRNA operon. A 
functional genomics database may thus be assembled, which contains the sequence and 
functional data of the functional mutant rRNA genes. In particular, functionally 
important regions of the rRNA gene that will serve as drug targets are identified by 
comparing the sequences of the functional genomics database and correlating the 
sequence with the amount of GFP protein produced. 

In another embodiment, the nucleotides in the functionally important target 
regions identified in the above methods may be simultaneously randomly mutated, e.g., 
by using standard methods of molecular mutagenesis, and cloned into a plasmid of the 
invention to form a plasmid pool containing random mutations at each of the nucleotide 
positions in the target region. The resulting pool of plasmids containing random 
mutations is then used to transform cells, e.g., E. coli cells, and form a library of clones, 
each of which contains a unique combination of mutations in the target region. The 
library of mutant clones is grown in the presence of IPTG to induce production of the 
mutant rRNA genes and a selectable marker, such as CAT, is used to select clones of 
rRNA mutants containing nucleotide combinations of the target region that produce 
functional ribosomes. The rRNA genes producing functional ribosomes are sequenced 
and may be incorporated into a database. 
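
To give a sense of scale for a plasmid pool in which every position of a target region is randomized, the following Python sketch computes the number of possible sequences and a rough estimate of how many clones would be needed to include any given sequence with a chosen probability. It is a back-of-the-envelope sketch, not part of the described method: the function names are hypothetical, and the estimate assumes all sequences are equally represented in the pool, which real mutagenesis rarely achieves.

```python
import math


def library_size(n_positions: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences when each position may be any nucleotide."""
    return alphabet_size ** n_positions


def clones_for_coverage(n_variants: int, probability: float = 0.95) -> int:
    """Clones needed so that a given variant is present with `probability`.

    Uses N = ln(1 - P) / ln(1 - 1/V), which assumes every variant is
    equally likely to be drawn (an idealization).
    """
    if n_variants <= 1:
        return 1
    return math.ceil(math.log(1.0 - probability)
                     / math.log(1.0 - 1.0 / n_variants))


if __name__ == "__main__":
    for n in (3, 6, 9):
        v = library_size(n)
        print(n, v, clones_for_coverage(v))
```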

In yet another embodiment, a series of oligonucleotides may be synthesized that 
contain the functionally important nucleotides and nucleotide motifs within the target 
region and may be used to sequentially screen compounds and compound libraries to 
identify compounds that recognize (bind to) the functionally important sequences and 
motifs. The compounds that bind to all of the oligonucleotides are then counterscreened 
against oligonucleotides and/or other RNA-containing molecules to identify drug 
candidates. Drug candidates selected by the methods of the present invention are thus 
capable of recognizing all of the functional variants of the target sequence, i.e., the 
target cannot be mutated in a way that the drug cannot bind, without causing loss of 
function to the ribosome. 

In still another embodiment, after the first stage mutagenesis of the entire rRNA 
is performed using techniques known in the art, e.g., error-prone PCR mutagenesis, the 
mutants are analyzed to identify regions within the rRNA that are important for function. 
These regions are then sorted based on their phylogenetic conservation, as described 
herein, and are then used for further mutagenesis. 

Ribosomal RNA sequences from each species are different, and the more closely 
related two species are, the more their rRNAs are alike. For instance, humans and 
monkeys have very similar rRNA sequences, but humans and bacteria have very 
different rRNA sequences. These differences may be utilized for the development of 
very specific drugs with a narrow spectrum of action and also for the development of 
broad-spectrum drugs that inhibit large groups of organisms that are only distantly 
related, such as all bacteria. 

In another embodiment, the functionally important regions identified above are 
divided into groups based upon whether or not they occur in closely related groups of 
organisms. For instance, some regions of rRNA are found in all bacteria but not in other 
organisms. Other areas of rRNA are found only in closely related groups of bacteria, 
such as all of the members of a particular species, e.g., members of the genus 
Mycobacterium or Streptococcus. 

In a further embodiment, the regions found in very large groups of organisms, 
e.g., all bacteria or all fungi, are used to develop broad-spectrum antibiotics that may be 
used to treat infections from a large number of organisms within that group. The 
methods of the present invention may be performed on these regions and functional 
mutant ribosomes identified. These functional mutant ribosomes may be screened, for 
example, with compound libraries. 

In yet another embodiment, regions that are located only in relatively small 
groups of organisms, such as all members of the genus Streptococcus or all members of 
the genus Mycobacterium, may be used to design narrow-spectrum antibiotics that will 
only inhibit the growth of organisms that fall within these smaller groups. The methods 
of the present invention may be performed on these regions and functional mutant 
ribosomes identified. These functional mutant ribosomes will be screened, e.g., with 
compound libraries. 
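
The grouping of functionally important regions by the organisms in which they occur can be pictured with a small Python sketch. It is only one possible way to express the idea: the function name, the group definitions and the organism lists below are hypothetical toy data, and a region is assigned to a group only when it occurs in every member of that group and in no organism outside it.

```python
def groups_matching_region(organisms_with_region, groups):
    """Return the names of groups whose members are exactly the organisms
    that carry the region (present in all members, absent elsewhere)."""
    present = set(organisms_with_region)
    return sorted(name for name, members in groups.items()
                  if present == set(members))


if __name__ == "__main__":
    # Toy group definitions, for illustration only.
    groups = {
        "all bacteria": {"E. coli", "M. tuberculosis", "M. leprae", "S. pneumoniae"},
        "genus Mycobacterium": {"M. tuberculosis", "M. leprae"},
    }
    # A region seen only in the two mycobacteria suggests a narrow-spectrum target.
    print(groups_matching_region({"M. tuberculosis", "M. leprae"}, groups))
    # A region seen in every listed bacterium suggests a broad-spectrum target.
    print(groups_matching_region(groups["all bacteria"], groups))
```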

The invention provides novel plasmid constructs, e.g., pRNA123 (Figures 1 and 
24). The novel plasmid constructs of the present invention employ novel mutant ASD 
and mutant SD sequences set forth in Figures 12, 13, 15 and 16. The mutant ASD and 
mutant SD sequences may be used as mutually compatible pairs (see Figures 12, 13, 15 
and 16). It will be appreciated that the mutually compatible pairs of mutant ASD and 
SD sequences interact as pairs in the form of RNA, to permit translation of only the 
mRNAs containing the altered SD sequence. 

Definitions 

As used herein, each of the following terms has the meaning associated with it in 
this section. 

The articles "a" and "an" are used herein to refer to one or to more than one (i.e., 
to at least one) of the grammatical object of the article. By way of example, "an 
element" means one element or more than one element. 

An "inducible" promoter is a nucleotide sequence which, when operably linked 
with a polynucleotide which encodes or specifies a gene product, causes the gene 
product to be produced in a living cell substantially only when an inducer which 
corresponds to the promoter is present in the cell. 

As used herein, the term "mutation" includes an alteration in the nucleotide 
sequence of a given gene or regulatory sequence from the naturally occurring or normal 
nucleotide sequence. A mutation may be a single nucleotide alteration (e.g., deletion, 
insertion, substitution, including a point mutation), or a deletion, insertion, or 
substitution of a number of nucleotides. 

By the term "selectable marker" is meant a gene whose expression allows one to 
identify functional mutant ribosomes. 

Various aspects of the invention are described in further detail in the following 
subsections: 

I. Isolated Nucleic Acid Molecules 

As used herein, the term "nucleic acid molecule" is intended to include DNA 
molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and 
analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid 
molecule can be single-stranded or double-stranded, but preferably is double-stranded 
DNA. 

The term "isolated nucleic acid molecule" includes nucleic acid molecules which 
are separated from other nucleic acid molecules which are present in the natural source 
of the nucleic acid. For example, with regards to genomic DNA, the term "isolated" 
includes nucleic acid molecules which are separated from the chromosome with which 
the genomic DNA is naturally associated. Preferably, an "isolated" nucleic acid is free 
of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and 
3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic 
acid is derived. Moreover, an "isolated" nucleic acid molecule, such as a cDNA 
molecule, can be substantially free of other cellular material, or culture medium, when 
produced by recombinant techniques, or substantially free of chemical precursors or 
other chemicals when chemically synthesized. 

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule 
having the nucleotide sequence set forth in Figures 12, 13, 15, and 16, or a portion 
thereof, can be isolated using standard molecular biology techniques and the sequence 
information provided herein. Using all or a portion of the nucleic acid sequence set forth 
in Figures 12, 13, 15, and 16 as a hybridization probe, the nucleic acid molecules of the 
present invention can be isolated using standard hybridization and cloning techniques 
(e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A 
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, NY, 1989). 

Moreover, a nucleic acid molecule encompassing all or a portion of the sequence 
set forth in Figures 12, 13, 15, and 16 can be isolated by the polymerase chain reaction 
(PCR) using synthetic oligonucleotide primers designed based upon the sequence set 
forth in Figures 12, 13, 15, and 16. 

A nucleic acid of the invention can be amplified using cDNA, mRNA or, 
alternatively, genomic DNA as a template and appropriate oligonucleotide primers 
according to standard PCR amplification techniques. The nucleic acid so amplified can 
be cloned into an appropriate vector and characterized by DNA sequence analysis. 
Furthermore, oligonucleotides corresponding to the nucleotide sequences of the present 
invention can be prepared by standard synthetic techniques, e.g., using an automated 
DNA synthesizer. 

In another preferred embodiment, an isolated nucleic acid molecule of the 
invention comprises a nucleic acid molecule which is a complement of the nucleotide 
sequence set forth in Figures 12, 13, 15, and 16, or a portion of any of these nucleotide 
sequences. A nucleic acid molecule which is complementary to the nucleotide sequence 
shown in Figures 12, 13, 15, and 16 is one which is sufficiently complementary to the 
nucleotide sequence shown in Figures 12, 13, 15, and 16 such that it can hybridize to the 
nucleotide sequence shown in Figures 12, 13, 15, and 16, respectively, thereby forming a 
stable duplex. 

II. Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression 
vectors, containing a nucleic acid molecule of the present invention (or a portion 
thereof). As used herein, the term "vector" refers to a nucleic acid molecule capable of 
transporting another nucleic acid to which it has been linked. One type of vector is a 
"plasmid", which refers to a circular double-stranded DNA loop into which additional 
DNA segments can be ligated. Another type of vector is a viral vector, wherein 
additional DNA segments can be ligated into the viral genome. Certain vectors are 
capable of autonomous replication in a host cell into which they are introduced (e.g., 
bacterial vectors having a bacterial origin of replication and episomal mammalian 
vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the 
genome of a host cell upon introduction into the host cell, and thereby are replicated 
along with the host genome. Moreover, certain vectors are capable of directing the 
expression of genes to which they are operatively linked. Such vectors are referred to 
herein as "expression vectors". In general, expression vectors of utility in recombinant 
DNA techniques are often in the form of plasmids. In the present specification, 
"plasmid" and "vector" can be used interchangeably as the plasmid is the most 
commonly used form of vector. However, the invention is intended to include such 
other forms of expression vectors, such as viral vectors (e.g., replication-defective 
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent 
functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of 
the invention in a form suitable for expression of the nucleic acid in a host cell, which 
means that the recombinant expression vectors include one or more regulatory 
sequences, selected on the basis of the host cells to be used for expression, which are 
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant 
expression vector, "operably linked" is intended to mean that the nucleotide sequence of 
interest is linked to the regulatory sequence(s) in a manner which allows for expression 
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a 
host cell when the vector is introduced into the host cell). The term "regulatory 
sequence" is intended to include promoters, enhancers and other expression control 
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for 
example, in Goeddel (1990) Methods Enzymol. 185:3-7. Regulatory sequences include 
those which direct constitutive expression of a nucleotide sequence in many types of 
host cells and those which direct expression of the nucleotide sequence only in certain 
host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those 
skilled in the art that the design of the expression vector can depend on such factors as 
the choice of the host cell to be transformed, the level of expression of protein desired, 
and the like. The expression vectors of the invention can be introduced into host cells to 
thereby produce proteins or peptides, including fusion proteins or peptides, encoded by 
nucleic acids as described herein. 

Expression of proteins in prokaryotes is most often carried out in E. coli with 
vectors containing constitutive or inducible promoters directing the expression of either 
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein 
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion 
vectors typically serve three purposes: 1) to increase expression of recombinant protein; 
2) to increase the solubility of the recombinant protein; and 3) to aid in the purification 
of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion 
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion 
moiety and the recombinant protein to enable separation of the recombinant protein from 
the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and 
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. 
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D.B. 
and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA) 
and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), 
maltose E binding protein, or protein A, respectively, to the target recombinant protein. 

Examples of suitable inducible non-fusion E. coli expression vectors include 
pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) 
Methods Enzymol. 185:60-89). Target gene expression from the pTrc vector relies on 
host RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene 
expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion 
promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral 
polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident 
prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV 5 
promoter. 

One strategy to maximize recombinant protein expression in E. coli is to express 
the protein in a host bacteria with an impaired capacity to proteolytically cleave the 
recombinant protein (Gottesman, S. (1990) Methods Enzymol. 185:119-128). Another 
strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an 
expression vector so that the individual codons for each amino acid are those 
preferentially utilized in E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118). 
Such alteration of nucleic acid sequences of the invention can be carried out by standard 
DNA synthesis techniques. 

In another embodiment, the expression vector may be a yeast expression vector. 
Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et 
al. (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz (1982) Cell 30:933-943), 
pJRY88 (Schultz et al. (1987) Gene 54:113-123), pYES2 (Invitrogen Corporation, San 
Diego, CA), and picZ (Invitrogen Corp, San Diego, CA). 

In yet another embodiment, a nucleic acid of the invention is expressed in 
mammalian cells using a mammalian expression vector. Examples of mammalian 
expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC 
(Kaufman et al. (1987) EMBO J. 6:187-195). When used in mammalian cells, the 
expression vector's control functions are often provided by viral regulatory elements. 
For example, commonly used promoters are derived from polyoma, Adenovirus 2, 
cytomegalovirus and Simian Virus 40. For other suitable expression systems for both 
prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J. et al., Molecular 
Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989. 

In another embodiment, the recombinant mammalian expression vector is 
capable of directing expression of the nucleic acid preferentially in a particular cell type 
(e.g., tissue-specific regulatory elements are used to express the nucleic acid). 
Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable 
tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. 
(1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) 
Adv. Immunol. 43:235-275), particular promoters of T cell receptors (Winoto and 
Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 
33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters 
(e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 
86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), 
and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 
4,873,316 and European Application Publication No. 264,166). 
Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters 
(Kessel and Gruss (1990) Science 249:374-379). 

Another aspect of the invention pertains to host cells into which the nucleic 
acid molecule of the invention is introduced. The terms "host cell" and "recombinant 
host cell" are used interchangeably herein. It is understood that such terms refer not only 
to the particular subject cell but to the progeny or potential progeny of such a cell. 
Because certain modifications may occur in succeeding generations due to either 
mutation or environmental influences, such progeny may not, in fact, be identical to the 
parent cell, but are still included within the scope of the term as used herein. 

A host cell can be any prokaryotic or eukaryotic cell. Other suitable host cells 
are known to those skilled in the art. 

Vector DNA can be introduced into prokaryotic or eukaryotic cells via 
conventional transformation or transfection techniques. As used herein, the terms 
"transformation" and "transfection" are intended to refer to a variety of art-recognized 
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including 
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated 
transfection, lipofection, or electroporation. Suitable methods for transforming or 
transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A 
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the 
expression vector and transfection technique used, only a small fraction of cells may 
integrate the foreign DNA into their genome. In order to identify and select these 
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is 
generally introduced into the host cells along with the gene of interest. Nucleic acid 
encoding a selectable marker can be introduced into a host cell on the same vector as that 
encoding a protein or can be introduced on a separate vector. Cells stably transfected 
with the introduced nucleic acid can be identified by drug selection (e.g., cells that have 
incorporated the selectable marker gene will survive, while the other cells die). 

III. Uses and Methods of the Invention 

The nucleic acid molecules described herein may be used in a plasmid construct, 
e.g., pRNA123, to carry out one or more of the following methods: (1) creation of a 
functional genomics database of the rRNA genes generated by the methods of the 
present invention; (2) mining of the database to identify functionally important regions 
of the rRNA; (3) identification of functionally important sequences and structural motifs 
within each target region; (4) screening compounds and compound libraries against a 
series of functional variants of the target sequence to identify compounds that bind to all 
functional variants of the target sequence; and (5) counterscreening the compounds 
against nontarget RNAs, such as human ribosomes or ribosomal RNA sequences. 

This invention is further illustrated by the following examples, which should not 
be construed as limiting. The contents of all references, patents and published patent 
applications cited throughout this application, as well as the Figures and Appendices, are 
incorporated herein by reference. 

SPECIFIC EXAMPLES 

EXAMPLE 1: IDENTIFICATION OF MUTANT SD AND MUTANT ASD 
COMBINATIONS 

It has been shown that by coordinately changing the SD and ASD, a particular 
mRNA containing an altered SD could be targeted to ribosomes containing the altered 
ASD. This and all other efforts to modify the ASD, however, have proved lethal, as cells 
containing these mutations died within two hours after the genes containing them were 
activated. 

Using random mutagenesis and genetic selection, mutant SD-ASD combinations 
were screened in order to identify nonlethal SD-ASD combinations. The mutant SD-ASD 
mutually compatible pairs are set forth in Figures 12, 13, 15 and 16. The mutually 
compatible pairs of mutant sequences interact as pairs in the form of RNA. The novel 
mutant SD-ASD sequence combinations of the present invention permit translation of 
only the mRNAs containing the altered SD sequence. 
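
The notion of a mutually compatible SD-ASD pair, i.e., two RNA sequences that base-pair with one another, can be illustrated with a short Python sketch. The wild-type core sequences used below (SD GGAGG, ASD CCUCC) are the commonly cited E. coli cores; the "mutant" pair is a made-up example and is not one of the pairs of Figures 12, 13, 15 and 16. The sketch assumes strict Watson-Crick pairing over the full length and ignores G-U wobble pairs.

```python
# Watson-Crick RNA pairs only; G-U wobble pairing is ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def is_compatible_pair(sd: str, asd: str) -> bool:
    """True if `asd` (5'->3') pairs perfectly, antiparallel, with `sd` (5'->3')."""
    return asd == reverse_complement(sd)


if __name__ == "__main__":
    wt_sd, wt_asd = "GGAGG", "CCUCC"          # commonly cited wild-type cores
    mutant_sd, mutant_asd = "AUCCC", "GGGAU"  # hypothetical pair, for illustration
    print(is_compatible_pair(wt_sd, wt_asd))          # wild-type mRNA, wild-type ribosome
    print(is_compatible_pair(mutant_sd, mutant_asd))  # altered mRNA, altered ribosome
    print(is_compatible_pair(mutant_sd, wt_asd))      # altered mRNA, wild-type ribosome
```

In this picture, an mRNA carrying the altered SD is read only by ribosomes carrying the matching altered ASD, which is the property the selection described above relies on.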

EXAMPLE 2: CONSTRUCTION OF THE pRNA123 PLASMID 

A plasmid construct of the present invention, identified as the pRNA123 plasmid, 
is set forth in Figures 1 and 24. E. coli cells contain a single chromosome with seven 
copies of the rRNA genes and all of the genes for the ribosomal proteins. The plasmid, 
pRNA123, in the cell contains a genetically engineered copy of one of the rRNA genes 
from E. coli and two genetically engineered genes that are not normally found in E. coli, 
referred to herein as "selectable markers." One gene encodes the protein 
chloramphenicol acetyltransferase (CAT). This protein renders cells resistant to 
chloramphenicol by chemically modifying the antibiotic. Another gene, encoding the 
Green Fluorescent Protein (GFP), is also included in the system. GFP facilitates 
high-throughput functional analysis. The amount of green light produced upon irradiation 
with ultraviolet light is proportional to the amount of GFP present in the cell. 

Ribosomes from pRNA123 have an altered ASD sequence. Therefore, the 
ribosomes can only translate mRNAs that have an altered SD sequence. Only two genes 
in the cell produce mRNAs with altered SD sequences that may be translated by the 
plasmid-encoded ribosomes: the CAT and GFP genes. Mutations in rRNA affect the 
ability of the resulting mutant ribosome to make protein. The present invention thus 
provides a system whereby the mutations in the plasmid-encoded rRNA gene only affect 
the amount of GFP and CAT produced. A decrease in plasmid ribosome function makes 
the cell more sensitive to chloramphenicol and reduces the amount of green fluorescence 
of the cells. Translation of the other mRNAs in the cell is unaffected since these 
mRNAs are translated only by ribosomes that come from the chromosome. Hence, cells 
containing functional mutants may be identified and isolated via the selectable marker. 

EXAMPLE 3: GENETIC SYSTEM FOR FUNCTIONAL ANALYSIS OF 
RIBOSOMAL RNA 

Identification of Functionally Important Regions of rRNA. Functionally 
important regions of rRNA molecules that may be used as drug targets may be identified 
using a functional genomics approach through a series of steps. Namely, in step I.a., the 
entire rRNA gene is randomly mutated using error-prone PCR or another generalized 
mutational strategy. In step I.b., a host cell is then transformed with a mutagenized 
plasmid comprising: an rRNA gene having a mutant ASD sequence, at least one 
mutation in said rRNA gene, and a genetically engineered gene which encodes a 
selectable marker having a mutant SD sequence, and production of the rRNA genes from 
the plasmid is induced by growing the cells in the presence of IPTG. In step I.c., the 
CAT gene is used to select mutants that are functional by plating the transformed cells 
onto growth medium containing chloramphenicol. In step I.d., individual clones from 
each of the colonies obtained in step I.c. are isolated. In step I.e., the plasmid DNA from 
each of the individual clones from step I.d. is isolated. In step I.f., the rRNA genes 
contained in each of the plasmids that had been isolated in step I.e. are sequenced. In 
step I.g., the function of each of the mutants from step I.f. is assessed by measuring the 
amount of GFP produced by each clone from step I.e. upon induction of the rRNA 
operon. In step I.h., a functional genomics database is assembled containing the 
sequence and functional data from steps I.f. and I.g. In step I.i., functionally important 
regions of the rRNA gene that will serve as drug targets are identified. Functionally 
important regions may be identified by comparing the sequences of all of the functional 
genomics database constructed in step I.g. and correlating the sequence with the amount 
of GFP protein produced. Contiguous sequences of three or more rRNA nucleotides, in 
which substitution of the nucleotides in the region produces significant loss of function, 
will constitute a functionally important region and therefore a potential drug target. 
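
The final criterion of step I.i. (three or more contiguous nucleotides whose substitution produces significant loss of function) lends itself to a simple database query, sketched below in Python. The function name, the data layout and, in particular, the 0.5 cutoff used to stand in for "significant" are assumptions made for illustration; the specification does not state a numerical threshold.

```python
def important_regions(loss_by_position, threshold=0.5, min_length=3):
    """Find runs of contiguous rRNA positions where substitution causes loss of function.

    loss_by_position: dict mapping nucleotide position -> fractional loss of
        GFP signal (0.0 = no loss, 1.0 = complete loss) observed on substitution.
    threshold: loss treated as "significant" (an assumed cutoff).
    min_length: minimum run length; the text specifies three or more nucleotides.
    Returns a list of (first_position, last_position) tuples.
    """
    sensitive = sorted(p for p, loss in loss_by_position.items() if loss >= threshold)
    regions, run = [], []
    for position in sensitive:
        if run and position == run[-1] + 1:
            run.append(position)          # extend the current contiguous run
        else:
            if len(run) >= min_length:
                regions.append((run[0], run[-1]))
            run = [position]              # start a new run
    if len(run) >= min_length:
        regions.append((run[0], run[-1]))
    return regions


if __name__ == "__main__":
    # Made-up functional data for six positions.
    data = {100: 0.9, 101: 0.8, 102: 0.7, 103: 0.1, 104: 0.9, 105: 0.9}
    print(important_regions(data))
```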

Isolation of Functional Variants of the Target Regions. A second aspect of 
the invention features identification of mutations of the target site that might lead to 
antibiotic resistance using a process termed "instant evolution", as described below. In 



WO 2004/003511    PCT/US2003/020963 

step II.a., for a given target region identified in step I.i., each of the nucleotides in the 
target region is simultaneously randomly mutated using standard methods of molecular 
mutagenesis, such as cassette mutagenesis or PCR mutagenesis, and cloned into the 
plasmid of step I.b. to form a plasmid pool containing random mutations at each of the 
nucleotide positions in the target region. In step II.b., the resulting pool of plasmids 
containing random mutations from step II.a. is used to transform E. coli cells and form a 
library of clones, each of which contains a unique combination of mutations in the target 
region. In step II.c., the library of mutant clones from step II.b. is grown in the presence 
of IPTG to induce production of the mutant rRNA genes. In step II.d., the induced 
mutants are plated on medium containing chloramphenicol, and CAT is used to select 
clones of rRNA mutants containing nucleotide combinations of the target region that 
produce functional ribosomes. In step II.e., the functional clones isolated in step II.d. 
are sequenced and GFP is used to measure ribosome function in each one. In step II.f., 
the data from step II.e. are incorporated into a mutational database. 

Isolation of Drug Leads. In step III.a., the database in step II.f. is analyzed to
identify functionally important nucleotides and nucleotide motifs within the target
region. In step III.b., the information from step III.a. is used to synthesize a series of
oligonucleotides that contain the functionally important nucleotides and nucleotide
motifs identified in step III.a. In step III.c., the oligonucleotides from step III.b. are
used to sequentially screen compounds and compound libraries to identify compounds
that recognize (bind to) the functionally important sequences and motifs. In step III.d.,
compounds that bind to all of the oligonucleotides are counterscreened against
oligonucleotides and/or other RNA-containing molecules to identify drug candidates.
"Drug candidates" are compounds that 1) bind to all of the oligonucleotides containing
the functionally important nucleotides and nucleotide motifs, but do not bind to
molecules that do not contain the functionally important nucleotides and nucleotide
motifs and 2) do not recognize human ribosomes. Because drug candidates selected by the
methods of the present invention recognize all of the functional variants of the
target sequence, the target cannot be mutated in a way that prevents the drug from
binding without causing loss of function to the ribosome.
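
The selection logic of steps III.c. and III.d. can be sketched as simple set operations. This is only an illustrative sketch: the function name, the compound identifiers and the screening results below are hypothetical and do not come from the specification.

```python
def select_drug_candidates(binders_per_oligo, counterscreen_binders, human_ribosome_binders):
    """Return compounds that bind every target oligo and none of the counterscreen targets."""
    # Step III.c: keep only compounds that bind ALL target oligonucleotides.
    bind_all = set.intersection(*binders_per_oligo.values())
    # Step III.d: discard compounds that also bind non-target RNA or human ribosomes.
    return bind_all - counterscreen_binders - human_ribosome_binders


# Hypothetical screening results (compound IDs are made up for illustration).
binders_per_oligo = {
    "oligo_1": {"cpd_A", "cpd_B", "cpd_C"},
    "oligo_2": {"cpd_A", "cpd_B", "cpd_D"},
    "oligo_3": {"cpd_A", "cpd_B"},
}
counterscreen_binders = {"cpd_B"}
human_ribosome_binders = {"cpd_E"}

candidates = select_drug_candidates(
    binders_per_oligo, counterscreen_binders, human_ribosome_binders
)
print(sorted(candidates))
```

In this toy data set only cpd_A binds every functional variant while avoiding both counterscreens, so it is the only compound retained.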

EXAMPLE 4: GENETIC SYSTEM FOR STUDYING PROTEIN SYNTHESIS 

Materials and Methods

Reagents. All reagents and chemicals were as in Lee, K., et al. (1996) RNA 2:
1270-1285. PCR-directed mutagenesis was performed essentially by the method of
Higuchi, R. (1989) PCR Technology (Erlich, H. A., ed.), pp. 61-70. Stockton Press, New
York, NY. The primers used in the present invention are listed in Figure 9. The
plasmids used in the present invention are listed in Figure 10.

Bacterial strains and media. All plasmids were maintained and expressed in E.
coli DH5 (supE44, hsdR17, recA1, endA1, gyrA96, thi-1 and relA1) (36). To induce
synthesis of plasmid-derived rRNA from the lacUV5 promoter, IPTG was added to a
final concentration of 1 mM. Chloramphenicol acetyltransferase activity was determined
essentially as described by Nielsen et al. (1989) Anal. Biochem. 179: 19-23. Cultures
for CAT assays were grown in LB-Ap100. MICs were determined by standard methods
in microtiter plates as described in Lee, K., et al. (1997) J. Mol. Biol. 269: 732-743.

Primer extension. To determine the ratio of plasmid to chromosome-derived
rRNA, pRNA104-containing cells growing in LB-Ap100 were harvested at the time
intervals indicated and total RNA was extracted using the Qiagen RNeasy kit
(Chatsworth, CA). The 30S, 70S, and crude ribosomes were isolated from 200 mL of
induced, plasmid-containing cells by the method of Powers and Noller (Powers, T. et al.
(1991) EMBO J. 10: 2203-2214). The purified RNA was analyzed by primer extension
according to Sigmund, C. D., et al. (1988) Methods Enzymol. 164: 673-690.

Experimental Procedures 

Generation of pRNA9 construct. The initial construct, pRNA9, was generated
using the following methods. Plasmid pRNA9 contains a copy of the rrnB operon from
pKK3535 under transcriptional regulation of the lacUV5 promoter; this
well-characterized promoter is not subject to catabolic repression and is easily and
reproducibly inducible with isopropyl-β-D-thiogalactoside (IPTG). To minimize
transcription in the absence of inducer, PCR was used to amplify and subclone the lac
repressor variant, lacIq (Calos, M. P. (1978) Nature 274: 762-765) from pSPORT1 (Life
Technologies, Rockville, MD). The chloramphenicol acetyltransferase gene (cam) is
present and transcribed constitutively from a mutant tryptophan promoter, trp' (De Boer,
H. A., et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80: 21-25; Hui, A., et al. (1987) Proc.
Natl. Acad. Sci. U.S.A. 84: 4762-4766). The β-lactamase gene is also present to allow
maintenance of plasmids in the host strain. To allow genetic selection, the CAT
structural gene from pJLS1021 (Schottel, J. L., et al. (1984) Gene 28: 177-193) was
amplified and placed downstream of a constitutive trp' promoter using PCR. Expression
of the CAT gene in E. coli renders the cell resistant to chloramphenicol, and the minimal
inhibitory concentration, hereinafter referred to as MIC, of chloramphenicol increases
proportionally with the amount of CAT protein produced (Lee, K., et al. (1996) RNA 2:
1270-1285; Lee, K., et al. (1997) J. Mol. Biol. 269: 732-743). An overview of the steps
used to construct the system is shown in Figure 2.

Selection of a new MBS-RBS pair. To isolate message binding site-ribosome
binding site, hereinafter referred to as MBS-RBS, combinations that are nonlethal and
efficiently translated only by plasmid-derived ribosomes, a random mutagenesis and
selection scheme was used. In particular, the plasmid-encoded 16S MBS and CAT
RBS were randomly mutated using PCR so that the wild-type nucleotide at each position
was excluded. An autoradiogram of sequencing gels with pRNAS-rMBS-rRBS is
provided in Figure 3. The resulting 2.5 x 10* doubly mutated transformants were
induced for 3.5 hours in SOC medium containing 1 mM IPTG and plated on Luria broth
medium containing 100 µg/mL ampicillin, 350 µg/mL chloramphenicol and 1 mM
IPTG. To confirm the presence of all three alternative nucleotides at each mutated
position, plasmid DNA from approximately 2.0 x 10^ transformants was sequenced
(Figure 3).

Results 

The data show that all of the nonexcluded nucleotides were equally represented
in the random pool. Of the 2.5 x 10* transformants plated, 536 survived the
chloramphenicol selection. The efficiency of the selected MBS-RBS combinations was
determined by measuring the minimal inhibitory concentration, hereinafter referred to as
MIC, of chloramphenicol for each survivor in the presence and absence of inducer
(Figure 11) (Lee, K., et al. (1996) RNA 2: 1270-1285; Lee, K., et al. (1997) J. Mol. Biol.
269: 732-743). Nine of the isolates (1.7%) showed MICs in the presence of inducer
that were lower than the 350 µg/mL concentration at which they were selected. These
were slow-growing mutants that appeared after 48 hours during the initial isolation. The
MICs, however, were scored after only 24 hours. The MICs for 451 of the isolates (84.1%)
were between 400 and 600 µg/mL, and those of the remaining 76 clones (14.2%) were 600
µg/mL. The difference in chloramphenicol resistance between induced and uninduced
cells (ΔMIC) reflects the amount of CAT translation by plasmid-derived ribosomes only. A
specific interaction between plasmid-derived ribosomes and CAT mRNA was indicated
in 79 (14.7%) of the clones, which showed four- to eightfold increases in CAT resistance
upon addition of IPTG (Figure 11).
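
As a quick consistency check, the percentages quoted above can be recomputed from the reported clone counts. The group labels in this sketch are informal shorthand chosen for illustration, not terms from the specification.

```python
survivors = 536  # clones that survived chloramphenicol selection

# Clone counts per MIC group as reported in the text (labels are informal).
mic_groups = {
    "below selection concentration": 9,
    "400 to 600 ug/mL": 451,
    "600 ug/mL": 76,
}

# The three groups should account for every survivor.
assert sum(mic_groups.values()) == survivors

percentages = {label: round(100 * n / survivors, 1) for label, n in mic_groups.items()}
specific_fraction = round(100 * 79 / survivors, 1)  # clones with four- to eightfold induction

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"specific MBS-RBS interaction: {specific_fraction}%")
```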

Based on these analyses, 11 clones were retained for additional study. The MBS
and RBS in plasmids from these clones were sequenced and CAT assays and growth
curves were performed (Figures 4 and 12). Although a wide range of inducibility was
observed, there was no correlation between specificity and predicted free energy (ΔG).
Purines were preferred in all of the MBS positions, but the RBS did not show this
sort of selectivity. This can be explained partially by the observation that the selected
RBS can base pair with sequences adjacent to the mutated region of 16S rRNA (Lee, K.,
et al. (1996) RNA 2: 1270-1285).

Growth curves were performed for all of the selected mutants and compared with
strains containing control constructs (Figure 4). Only one mutant (1X24) is shown in
Figure 4, but all strains containing the selected MBS/RBS sequences showed the same
pattern of growth as this mutant. Because of its induction profile, strain 1X24
(containing plasmid pRNA100) was chosen for additional experimentation. To
eliminate the possibility that mutations outside the MBS and RBS had been
inadvertently selected, the DraIII and XbaI fragment containing the MBS and the KpnI
and XhoI fragment containing the RBS sequence from pRNA100 (Figure 5) were
transferred to pRNA9.


Specificity of the system. The rate of ribosome induction and the ratio of plasmid
to chromosome-derived rRNA at each stage of growth were determined. For this, a
pRNA100 derivative, pRNA104, which contains a C1192U mutation in 16S rRNA, was
constructed (Sigmund, C. D., et al. (1984) Nucleic Acids Res. 12: 4653-4663; Triman,
K., et al. (1989) J. Mol. Biol. 209: 645-653) so that plasmid-derived rRNA could be
differentiated from wild-type rRNA by primer extension. The C1192U mutation does
not affect ribosome function in other expression systems (Sigmund, C. D., et al. (1984)
Nucleic Acids Res. 12: 4653-4663; Makosky, P. C. et al. (1987) Biochimie 69: 885-889).
To show that the same is true in the present system, CAT activity was measured
after 3 hours of induction with 1 mM IPTG in DH5 cells expressing pRNA100 or
pRNA104 and the two were compared. In these experiments, no significant difference
between cells expressing pRNA104 (99.2 ± 2.8%) or pRNA100 (100%) was observed.

To determine the percentage of plasmid-derived ribosomes in cells containing the
plasmid, total RNA was isolated from DH5 cells carrying pRNA104 before and after
induction with IPTG and subjected to primer extension analysis (Lee, K., et al. (1997) J.
Mol. Biol. 269: 732-743; Sigmund, C. D., et al. (1984) Nucleic Acids Res. 12: 4653-4663;
Makosky, P. C. et al. (1987) Biochimie 69: 885-889). Maximum induction of
plasmid-derived ribosomes occurred 3 hours after induction, at which point they
constituted approximately 40% of the total ribosome pool (Figure 6). CAT activities in
these cells paralleled induction of plasmid-derived ribosomes and began to decrease 4
hours after induction, presumably due to protein degradation during stationary phase. In
uninduced cells, approximately 3% of the total ribosome pool consists of plasmid-derived
ribosomes because of basal level transcription from the lacUV5 promoter.

Optimization of the system. Chloramphenicol resistance in uninduced cells
containing pRNA100 is 75 µg/mL (Figure 13, MIC = 100 µg/mL). By measuring CAT
resistance in a derivative of pRNA100 containing a wild-type 16S rRNA gene, it was
determined that approximately one-half of this background activity was due to CAT
translation by wild-type ribosomes (Figure 13, pRNA100 + wt MBS). The remaining
activity in uninduced cells is presumably due to leakiness of the lacUV5 promoter
(Figure 6). The nucleotide sequence located between the RBS and the start codon in
mRNA affects translational efficiency (Calos, M. P. (1978) Nature 274: 762-765;
Stormo, G. D., et al. (1982) Nucleic Acids Res. 10: 2971-2996; Chen, H., et al. (1994)
Nucl. Acids Res. 22: 4953-4957). In pRNA100, three of the nucleotides found in this
region of the CAT mRNA are complementary with the 3' terminus of wild-type E. coli
16S RNA (Figure 11, pRNA100 + wt MBS). To eliminate the possibility that this was
contributing to CAT translation in the absence of plasmid-encoded ribosomes, four
nucleotides in the CAT gene (underlined in Figure 11) were randomly mutagenized and
screened to identify mutants with reduced translation by host ribosomes. A total of 2000
clones were screened in the absence of plasmid-encoded ribosomes using pCAM9 and
six poorly translated CAT sequences were isolated (Figure 13). Next, the BamHI
fragment of pRNA100 containing lacIq and the rrnB operon was added, and MIC, CAT
assays and growth curves were performed on cells expressing these constructs (data not
shown).

Based on these data, pRNA122 was chosen because it produced a slightly better
induction profile than the others (Figures 11 and 23). Translation of the pRNA122 CAT
message by wild-type ribosomes (Figure 11, pRNA122 + wt MBS) produces cells that
are sensitive to chloramphenicol concentrations <10 µg/mL. In the presence of
specialized ribosomes (Figure 13, pRNA122), the background chloramphenicol MIC is
between 40 and 50 µg/mL and the MIC for induced cells is between 550 and 600
µg/mL, producing an approximately 13-fold increase in CAT expression upon induction
in pRNA122. Induction of the rrnB operon in pRNA100 produces only an eightfold
increase.
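
The fold increase quoted for pRNA122 follows directly from the reported MIC ranges. The short sketch below takes the midpoint of each range as a point estimate, which is an assumption made here for illustration; the specification reports only the ranges.

```python
def fold_induction(mic_uninduced, mic_induced):
    """Ratio of induced to uninduced chloramphenicol MIC (both in ug/mL)."""
    return mic_induced / mic_uninduced


# pRNA122 MIC ranges from the text (ug/mL).
background_range = (40, 50)
induced_range = (550, 600)

# Midpoints are an illustrative assumption, not values stated in the source.
midpoint_fold = fold_induction(sum(background_range) / 2, sum(induced_range) / 2)

# Bounds obtained by pairing the extremes of the two ranges.
lowest_fold = fold_induction(background_range[1], induced_range[0])
highest_fold = fold_induction(background_range[0], induced_range[1])

print(f"midpoint estimate: {midpoint_fold:.1f}-fold")
print(f"range: {lowest_fold:.0f}- to {highest_fold:.0f}-fold")
```

The midpoint estimate rounds to 13-fold, in line with the figure given for pRNA122.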

Use of the system. To test the system, the effects of nucleotide substitutions at
the sole pseudouridine in E. coli 16S rRNA, located at position 516, were examined.
Because pseudouridine and U form equally stable base pairs with adenosine (Maden, B.
E. (1990) Prog. Nucleic Acid Res. Mol. Biol. 39: 241-303), mutations at A535 were also
constructed to determine whether the potential for base pair formation between these two
loci affected ribosome function. The mutations were constructed initially in a pUC19
(Yanisch-Perron, C., et al. (1985) Gene 33: 103-119) derivative containing the 16S
RNA gene, p16ST, as shown in Figure 7 and then transferred to pRNA122 for analysis.
This two-step process was used because the SacII restriction site located between the
two mutated positions is unique in pRNA16ST and is not unique in pRNA122. The
effect of the mutations in pRNA122 on protein synthesis in vivo was determined by
measuring the MIC and CAT activity of the mutant cells (Figure 8). At position 516,
ribosomes containing the single transition mutation, pseudouridine516C, produced
approximately 60% of the amount of functional CAT protein produced by wild-type
ribosomes. The transversion mutations, pseudouridine516A or pseudouridine516G,
however, reduced ribosome function by > 90%. All of the single mutations at position
535 retained > 50% of the function of wild-type ribosomes. To examine the possibility
that the potential for base pairing between positions 516 and 535 is necessary for
ribosome function, all possible mutations between these loci were also constructed and
analyzed (Figure 8). These data show that all of the double mutants were inactive (10%
or less of the wild-type) regardless of the potential to base pair. To examine the reasons
for loss of function in the 516 mutants, ribosomes from cells expressing single mutations
at position 516 were fractionated by sucrose density gradient centrifugation and the 30S
and 70S peaks were analyzed by primer extension to determine the percentage of
plasmid-derived 30S subunits present. The data in Figure 14 show a strong correlation
between ribosome function and the presence of plasmid-derived ribosomes in the 70S
ribosomal fraction, indicating that mutations at position 516 affect the ability of the
mutant 30S subunits to form 70S ribosomes.

The references cited in Example 4 may be found in Lee, K., et al. Genetic
Approaches to Studying Protein Synthesis: Effects of Mutations at Pseudouridine516
and A535 in Escherichia coli 16S rRNA. Symposium: Translational Control: A
Mechanistic Perspective at the Experimental Biology 2001 Meeting (2001) and in Lee,
K. et al. (2001) Genetic Approaches to Studying Protein Synthesis: Effects of Mutations
at Pseudouridine516 and A535 in Escherichia coli 16S rRNA. J. Nutrition 131
(11): 2994-3004.



EXAMPLE 5: IN VIVO DETERMINATION OF RNA STRUCTURE-FUNCTION
RELATIONSHIPS

Materials and Methods 

Reagents. Restriction enzymes, ligase, AMV reverse transcriptase and calf
intestine alkaline phosphatase were from New England Biolabs and from Gibco-BRL.
Sequenase modified DNA polymerase, nucleotides and sequencing buffers were from
USB/Amersham. Oligonucleotides were synthesized on-site using a Beckman Oligo
1000 DNA synthesizer. AmpliTaq DNA polymerase and PCR reagents were from
Perkin-Elmer-Cetus. [3H]Chloramphenicol (30.1 Ci/mmol) was from Amersham and
[α-35S]dATP (1000 Ci/mmol) was from New England Nuclear. Other chemicals were
from Sigma.

pRNA122. The key features of this construct are: (1) it contains a copy of the
rrnB operon from pKK3535 (Brosius, J., et al. (1981) Plasmid 6: 112-118) under
transcriptional regulation of the lacUV5 promoter; (2) it contains a copy of the lactose
repressor allele lacIq (Calos, M. P. (1978) Nature 274: 762-769); (3) the chloramphenicol
acetyltransferase gene (cam) is present and transcribed constitutively from a mutant
tryptophan promoter, trp' (de Boer, H. A., et al. (1983) Proc. Natl. Acad. Sci. USA 80:
21-25); (4) the RBS of the CAT message has been changed from the wild-type, 5'-GGAGG,
to 5'-AUCCC, and the MBS of the 16S rRNA gene has been changed to 5'-GGGAU;
and (5) the β-lactamase gene is present to allow maintenance of plasmids in the host
strain.


Bacterial strains and media. Plasmids were maintained and expressed in E. coli
DH5 (supE44, hsdR17, recA1, endA1, gyrA96, thi-1; Hanahan, D. (1983) J. Mol. Biol.
166: 557-580). Cultures were grown in LB medium (Luria, S. E. & Burrous, J. W. (1957)
J. Bacteriol. 74: 461-476) or LB medium containing 100 µg/ml ampicillin (LB-Ap100).
To induce synthesis of plasmid-derived rRNA from the lacUV5 promoter, IPTG was
added to a final concentration of 1 mM at the times indicated in each experiment.
Strains were transformed by electroporation (Dower, W. J., et al. (1988) Nucl. Acids Res.
16: 6127) using a Gibco-BRL Cell Porator. Unless otherwise indicated, transformants
were grown in SOC medium (Hanahan, 1983, supra) for one hour prior to plating on
selective medium to allow expression of plasmid-derived genes.

Chloramphenicol acetyltransferase assays. CAT activity was determined
essentially as described (Nielsen, D. A. et al. (1989) Anal. Biochem. 60:191-227).
Cultures for CAT assays were grown in LB-Ap100. Briefly, 0.5 ml aliquots of mid-log
cultures (unless otherwise indicated) were added to an equal volume of 500 mM
Tris-HCl (pH 8) and lysed using 0.01% (w/v) SDS and chloroform (Miller, J. H. (1992) A
Short Course in Bacterial Genetics, (Miller, J. H., ed.), pp. 71-80, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY). The resulting lysate was either used
directly or diluted in assay buffer prior to use. Assay mixtures contained cell extract (5
or 10 µl), 250 mM Tris (pH 8), 214 butyryl-coenzyme A (Bu-CoA), and 40 µM
[3H]chloramphenicol in a 125 µl volume. Two concentrations of lysate were assayed for
one hour at 37°C to ensure that the signal was proportional to protein concentrations.
The product, butyryl-[3H]chloramphenicol, was extracted into
2,6,10,14-tetramethylpentadecane:xylenes (2:1) and measured directly in a Beckman LS-3801
liquid scintillation counter. Blanks were prepared exactly as described above, except
that uninoculated LB medium was used instead of culture.

Minimum inhibitory concentration determination. MICs were determined by
standard methods in microtiter plates or on solid medium. Overnight cultures grown in
LB-Ap100 were diluted and induced in the same medium containing 1 mM IPTG for
three hours. Approximately 10 * induced cells were then added to wells (or spotted onto
solid medium) containing LB-Ap100 + IPTG (1 mM) and chloramphenicol at increasing
concentrations. Cultures were grown for 24 hours and the lowest concentration of
chloramphenicol that completely inhibited growth was designated as the MIC.

Random mutagenesis and selection. Random mutagenesis of the 790 loop was
performed essentially by the method of Higuchi (1989) using PCR and cloned in
pRNA122 using the unique BglII and DraIII restriction sites (Higuchi, R. (1989) PCR
Technology (Erlich, H.A., ed.), pp. 61-70, Stockton Press, New York) (Figure 18). For
each set of mutations, four primers were used: two "outside" primers and two "inside"
primers. The two outside primers were designed to anneal to either side of the BglII and
DraIII restriction sites in pRNA122 (Figure 1). These primers were 16S-DraIII,
5'-GACAATCTGTGTGAGCACTA-3' and 16S-535,
5'-TGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGT-3'. The inside primers
were 16S-786R, 5'-CCTGTTTGCTCCCCACGCTTTCGCACCTGAGCG-3' and 16S-ASS-3,
5'-CTCAGGTGCGAAAGCGTGGGGAGCAAACAGGNNNNNNNNNCCTGGTAGTCCACGCCGTAA-3'
(N = A, T, C and G). Thus, 4^9 = 262,144 possible combinations
were created, with the exception of 320 sequences that were eliminated because they
formed either BglII or DraIII recognition sites (256 BglII sites and 64 DraIII sites).
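
The library-size arithmetic above can be reproduced by brute-force enumeration. The sketch below assumes that only recognition sites lying entirely within the nine randomized positions are counted (sites that would span the fixed flanking sequence are ignored), and it uses the standard recognition sequences AGATCT for BglII and CACNNNGTG for DraIII; the function and variable names are illustrative.

```python
import re
from itertools import product

BGLII_SITE = "AGATCT"                         # BglII recognition sequence
DRAIII_SITE = re.compile("CAC[ACGT]{3}GTG")   # DraIII recognition sequence (CACNNNGTG)


def count_library(n_positions=9):
    """Enumerate every randomized loop sequence and count those creating a cloning site."""
    total = with_bglii = with_draiii = with_either = 0
    for bases in product("ACGT", repeat=n_positions):
        seq = "".join(bases)
        has_bglii = BGLII_SITE in seq
        has_draiii = DRAIII_SITE.search(seq) is not None
        total += 1
        with_bglii += has_bglii
        with_draiii += has_draiii
        with_either += has_bglii or has_draiii
    return total, with_bglii, with_draiii, with_either


total, with_bglii, with_draiii, with_either = count_library()
usable = total - with_either
print(total, with_bglii, with_draiii, with_either, usable)
```

Under these assumptions the enumeration gives 262,144 sequences in total, of which 256 contain a BglII site and 64 a DraIII site, leaving 261,824 usable combinations.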

Transformants were incubated in SOC medium containing 1 mM IPTG for four
hours to induce rRNA synthesis and then plated on LB agar containing 100 µg/ml
chloramphenicol. A total of 2 x 10 ' transformants were plated yielding approximately
2000 chloramphenicol-resistant survivors. Next, 736 of these survivors were randomly
chosen and assayed to determine the MIC of chloramphenicol necessary to completely
inhibit growth in cells expressing mutant ribosomes. From this pool, 182 transformants
with MICs greater than 100 µg/ml were randomly selected and sequenced.

Site-directed mutation of positions 787 and 795. Mutations at positions 787 and
795 were constructed as described above for the random mutants, except that the inside
primers were 16S-786R (see above) and 16S-ASS-4,
5'-CTCAGGTGCGAAAGCGTGGGGAGCAAACAGGNTTAGATANCCTGGTAGTCCACGCCGTAA-3'
(N = A, T, C and G). Transformants were selected on LB-Ap100
agar plates and grouped according to their MICs for chloramphenicol. Representatives
from each group were then sequenced to identify the mutations.

Primer extension. To determine the ratio of plasmid to chromosome-derived
rRNA, 30S and 70S ribosomes were isolated from 200 ml of induced,
plasmid-containing cells by the method of Powers & Noller (1991). The purified RNA was then
used in primer extension experiments (Triman, K., et al. (1989) J. Mol. Biol. 209:
643-653). End-labeled primers complementary to sequences 3' to the 788 and 795 mutation
sites were annealed to rRNA from induced cells and extended through the mutation site
using AMV reverse transcriptase. The primers used were: 16S-806R,
5'-GGACTACCAGGGTATCT-3'; 16S-814R, 5'-TACGGCGTGGACTACCA-3'. For
wild-type pRNA122 ribosomes, position 1192 in the 16S RNA gene was changed from
C to U and primers were constructed as described above (Triman et al., 1989, supra).
This mutation has previously been shown not to affect subunit association (Sigmund,
C. D., et al. (1988) Methods Enzymol. 164: 673-689). The extension mixture contained a
mixture of three deoxyribonucleotides and one dideoxyribonucleotide. The cDNAs were
resolved by PAGE and the ratios of mutant to non-mutant ribosomes were determined by
comparing the amount of radioactivity in each of the two bands.

Oligoribonucleotide synthesis. Oligoribonucleotides were synthesized on solid
support with the phosphoramidite method (Capaldi, D. & Reese, C. (1994) Nucl. Acids
Res. 22: 2209-2216) on a Cruachem PS 250 DNA/RNA synthesizer. Oligomers were
removed from solid support and deprotected by treatment with ammonia and acid
following the manufacturer's recommendations. The RNA was purified on a silica gel
Si500F TLC plate (Baker) eluted for five hours with n-propanol/ammonia/water
(55:35:10, by vol.). Bands were visualized with an ultraviolet lamp and the least mobile
band was cut out and eluted three times with 1 ml of purified water. Oligomers were
further purified with a Sep-pak C-18 cartridge (Waters) and desalted by continuous-flow
dialysis (BRL). Purities were checked by analytical C-8 HPLC (Perceptive Biosystems)
and were greater than 95%.

Experimental Procedures

Sequence analysis of functional mutants. Random mutations were introduced
simultaneously at all nine positions (787 to 795) in the 790 loop. Functional
(chloramphenicol-resistant) mutants were then selected in E. coli DH5 cells (Hanahan,
1983, supra) and the effects of these mutations on ribosome function were determined.
A total of 182 mutants that retained chloramphenicol resistance were randomly selected
and sequenced. Wild-type 790-loop sequences were obtained from 81 of the sequenced
transformants, while the remaining 101 contained mutant sequences. One of the
transformants was chloramphenicol-resistant in the absence of inducer, presumably due
to a spontaneous mutation in the CAT gene, and was excluded from further analysis. Of
100 sequenced functional mutants, 14 were duplicates and four sequences occurred three
times. Thus, 78 different, functional, 790-loop mutants were analyzed (Figure 19).
According to resampling theory, this distribution indicates that of the 4^9 = 262,144
possible sequences, only 190 (standard deviation 30) unique sequences exist in the pool
of selected functional mutants. Of the 78 mutants, 44 contained four to six substitutions
out of the nine bases mutated and 21 of these retained greater than 50% of the wild-type
activity. The minimal inhibitory concentration (MIC) of chloramphenicol for cells
expressing wild-type rRNA from pRNA122 is 600 µg/ml. MICs of the mutants ranged
from 150 to 550 µg/ml with a mean of 320 µg/ml (standard deviation 89). The median
and mode were both 350 µg/ml.
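
The pool-size estimate can be approximated with a simple occupancy calculation. The specification does not describe the resampling method actually used, so the sketch below is only one plausible reading: it assumes that every functional sequence is equally likely to be sampled, that "14 were duplicates" means 14 sequences were each seen twice, and it solves for the pool size whose expected number of distinct sequences in 100 draws equals the 78 observed.

```python
def expected_distinct(pool_size, draws):
    """Expected number of distinct items seen when sampling uniformly with replacement."""
    return pool_size * (1.0 - (1.0 - 1.0 / pool_size) ** draws)


def estimate_pool_size(distinct_observed, draws, upper=10**7):
    """Smallest pool size whose expected distinct count reaches the observed count."""
    lo, hi = distinct_observed, upper
    while lo < hi:  # bisection; expected_distinct increases with pool_size
        mid = (lo + hi) // 2
        if expected_distinct(mid, draws) < distinct_observed:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Occurrence counts inferred from the text: sequences seen once, twice, three times.
occurrences = {1: 60, 2: 14, 3: 4}
draws = sum(times * n for times, n in occurrences.items())   # sequenced mutants
distinct = sum(occurrences.values())                          # different sequences

pool_estimate = estimate_pool_size(distinct, draws)
print(draws, distinct, pool_estimate)
```

Under these assumptions the estimate lands close to the figure of 190 unique sequences quoted above; the sketch does not attempt to reproduce the reported standard deviation.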

Functional 790-loop mutants showed strong nucleotide preferences at all mutated
positions, except positions 788 and 792, which showed a random distribution (Figure 20)
but significant covariation. No mutations were observed at U789 or G791. Mutations at
these positions, however, were present in mutants that were selected for loss of function
(not shown). Thus, these nucleotides appear to be directly involved in ribosome
function. U789 is strictly conserved among bacteria but is frequently C789 among other
organisms (Figure 20). Chemical protection studies have shown that G791 is
specifically protected from kethoxal modification in 70S ribosomes and polysomes
(Brow, D. A. & Noller, H. F. (1983) J. Mol. Biol. 163: 112-118; Moazed, D. & Noller,
H. F. (1986) J. Mol. Biol. 191: 483-493) and by poly(U) (Moazed & Noller, 1986,
supra) and that G791 becomes more accessible to kethoxal modification when 30S
subunits are converted from the "inactive" to the "active" conformation (Moazed et al.,
1986, supra).

Purines were strongly selected at position 787 (97.4%) while A and, to a lesser
extent, C were preferred at position 790 (98.7%) and U was completely excluded at both
positions. At both positions 793 and 795, A, C and U were equally distributed but G was
selected against. Adenine and uracil were preferred at position 794 (81.8%).

Non-random distribution of nucleotides among the selected functional clones
indicates that nucleotide identity affects the level of ribosome function. To examine this,
the mean activities (MICs) of ribosomes containing all mutations at a given position
were compared by single-factor analysis of variance between ribosome function (MIC)
and nucleotide identity at each mutated position. Positions that showed a significant
effect of nucleotide identity upon the level of ribosome function were 787 (P < 0.001),
788 (P < 0.05) and 795 (P < 0.001). The absence of mutations at positions U789 and
G791 in the functional clones prevents statistical analysis of these positions but
mutations at these positions presumably strongly affect ribosome function as well.

Figure 20 shows a comparison of the selected functional mutants with current
phylogenetic data (R. Gutell, unpublished results; Gutell, R. R. (1994) Nucl. Acids Res.
22(17): 3502-3507; Maidak, B. L. et al. (1996) Nucl. Acids Res. 24: 82-85). While
nucleotide preferences in the selected mutants are similar to those observed in the
phylogenetic data, the mutant sequences selected in this study show much more
variability than those found in nature. This may be because all of the positions in the
loop were mutated simultaneously, allowing normally deleterious mutations in one
position to be compensated for by mutations at other positions, a process that is unlikely
to occur in nature. In addition, none of the mutants was as functional as the wild-type,
suggesting that wild-type 790-loop sequences have been selected for optimal activity or
that other portions of the translational machinery have been optimized to function with
the wild-type sequence.

To identify potential nucleotide covariation within the loop, the paired
distribution of selected nucleotides was examined for goodness of fit. The most
significant covariations were observed between positions 787 and 795 (P < 0.001) and
between positions 790 and 793 (P < 0.001). For positions 790 and 793, only eight
double mutants were available for analysis; therefore, the covariation observed between
these positions should be regarded with caution. Position 788, which showed no
nucleotide specificity, did show significant covariation with positions 787 (P < 0.01),
794 (P < 0.01) and 795 (P < 0.01).


Analysis of site-directed mutations constructed at the base of the loop:
Functional analysis of mutations at positions 787 and 795. The observed
covariations among positions 787, 788 and 795 are particularly interesting, since
nucleotide identity at these positions correlated with the level of ribosome function.
Further analysis of nucleotides at positions 787 and 795 revealed that 72 of the 78
functional mutants have the potential to form mismatched base-pairs (A·C, G·U, A·A
and G·G). Other mismatches, such as G·A and U·G, however, were not found. In
addition, only four sequences with an A·U Watson-Crick pair and no sequences with a
U·A, G·C or C·G pair were present, suggesting that strong base-pairs between these
positions inhibit ribosome function. Therefore all possible nucleotide combinations at
positions 787 and 795 were constructed and analyzed without changing other nucleotides
in the 790 loop. Ribosome function of the mutants (Figure 21) varied from 84% (A·A)
to 1% (C·G) of the wild-type. As predicted by analysis of the pool of functional
random mutants, site-directed mutants with G·C, C·G and U·A Watson-Crick pairs
between positions 787 and 795 were strongly inhibitory.
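
The sixteen nucleotide combinations constructed at positions 787 and 795 can be enumerated and labeled by pairing type as follows. This is an illustrative sketch only: the category names are a conventional classification chosen here (G·U and U·G are labeled separately as wobble pairs, whereas the text above groups G·U with the mismatches), and no activity values are included.

```python
from itertools import product

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(n787, n795):
    """Label the potential pairing between the nucleotides at positions 787 and 795."""
    pair = (n787, n795)
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# All 4 x 4 combinations, keyed as "787-795" (RNA alphabet).
pair_classes = {f"{a}-{b}": classify_pair(a, b) for a, b in product("ACGU", repeat=2)}

for name, label in pair_classes.items():
    print(name, label)
```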

Results

These data suggest that strong pairing between nucleotides at positions 787 and
795 inhibits ribosome function. In addition, some of the site-directed substitutions at
positions 787 and 795 that produced functional ribosomes were largely excluded from
the pool of mutants in which all of the loop positions were mutated simultaneously (e.g.
CC, CU, UU and UC). The observed nucleotide preferences at positions 787 and 795 in
the selected random pool presumably reflect interaction of nucleotides at these positions
with other nucleotides in the loop. This is consistent with our findings of extensive
covariations among these sites.

Perturbations of the 790 loop have been shown to affect ribosomal subunit
association (Herr, W., et al. (1979) J. Mol. Biol. 130: 433-449; Tapprich, W. & Hill, W.,
(1986) Proc. Natl. Acad. Sci. USA 83: 556-560; Tapprich, W., et al. (1989) Proc. Natl.
Acad. Sci. USA 86: 4927-4931). Therefore several of the 787 to 795 mutants were tested
for their ability to form 70S ribosomes. Ribosomes were isolated from selected mutants
and the distribution of mutant ribosomes in both the 70S and 30S peaks was determined
by primer extension (Figure 21). These data show that CAT activity correlates with the
presence of mutant 30S subunits in the 70S ribosome pool. Thus, loss of function may
be due to the inability of mutant 30S and 50S subunits to associate. Another
explanation for this observation is that the mutations may directly affect a stage of the
protein synthesis process prior to subunit association, such as initiation, which prevents
subsequent steps from occurring. Other mutations in the 16S rRNA have been identified
for which this appears to be the case (Cunningham, P., et al. (1993) Biochemistry 32:
7172-7180).

The references cited in Example 5 may be found in Lee, K. et al., J. Mol. Biol.
269: 732-743 (1997), expressly incorporated by reference herein.


EXAMPLE 6: CONSTRUCTION OF A HYBRID CONSTRUCT 

A plasmid construct of the present invention, identified as the hybrid construct, is
set forth in Figures 17 and 25. This hybrid construct contains a 16S rRNA from
Mycobacterium tuberculosis. The specific sites on the hybrid construct are as follows:
the part of 16S rRNA from the E. coli rrnB operon corresponds to nucleic acids 1-931;
the part of 16S rRNA from the Mycobacterium tuberculosis rrn operon corresponds to
nucleic acids 932-1542; the 16S MBS GGGAU corresponds to nucleic acids 1536-1540; the
terminator T1 of the E. coli rrnB operon corresponds to nucleic acids 1791-1834; the
terminator T2 of the E. coli rrnB operon corresponds to nucleic acids 1965-1994; the
replication origin corresponds to nucleic acids 3054-2438; the bla (β-lactamase;
ampicillin resistance) corresponds to nucleic acids 3214-4074; the GFP corresponds to
nucleic acids 5726-4992; the GFP RBS (ribosome binding sequence) AUCCC
corresponds to nucleic acids 5738-5734; the trp^c promoter corresponds to nucleic acids
5795-5755; the trp^c promoter corresponds to nucleic acids 6270-6310; the CAT RBS
(ribosome binding sequence) AUCCC corresponds to nucleic acids 6327-6331; the cam
(chloramphenicol acetyltransferase; CAT) corresponds to nucleic acids 6339-6998; the
lacI^q promoter corresponds to nucleic acids 7307-7384; the lacI^q (lac repressor)
corresponds to nucleic acids 7385-8467; and the lac UV5 promoter corresponds to
nucleic acids 8510-8551.
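
The site list for the hybrid construct can be held in a small data structure for
checking lengths and orientation. The Python sketch below is illustrative only and is
not part of the disclosed method. The names Feature, HYBRID_FEATURES, revcomp_rna and
is_watson_crick_pair are hypothetical, and the coordinates are copied from the list
above. It assumes that a start coordinate larger than the end coordinate marks a
feature read on the complementary strand, which the text does not state explicitly.
It also treats "mutually compatible" as strict Watson-Crick complementarity between
the MBS and the RBS, which is a simplification.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One annotated site on the hybrid construct (1-based, inclusive)."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return abs(self.end - self.start) + 1

    @property
    def strand(self) -> str:
        # Assumption (not stated in the text): start > end marks a feature
        # read on the complementary strand.
        return "-" if self.start > self.end else "+"


# Coordinates as listed in Example 6.
HYBRID_FEATURES = [
    Feature("16S rRNA, E. coli rrnB part", 1, 931),
    Feature("16S rRNA, M. tuberculosis rrn part", 932, 1542),
    Feature("16S MBS GGGAU", 1536, 1540),
    Feature("terminator T1", 1791, 1834),
    Feature("terminator T2", 1965, 1994),
    Feature("replication origin", 3054, 2438),
    Feature("bla", 3214, 4074),
    Feature("GFP", 5726, 4992),
    Feature("GFP RBS AUCCC", 5738, 5734),
    Feature("trp^c promoter (GFP side)", 5795, 5755),
    Feature("trp^c promoter (CAT side)", 6270, 6310),
    Feature("CAT RBS AUCCC", 6327, 6331),
    Feature("cam (CAT)", 6339, 6998),
    Feature("lacI^q promoter", 7307, 7384),
    Feature("lacI^q", 7385, 8467),
    Feature("lac UV5 promoter", 8510, 8551),
]

_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence (strict Watson-Crick pairs)."""
    return "".join(_RNA_PAIR[base] for base in reversed(seq.upper()))


def is_watson_crick_pair(mbs: str, rbs: str) -> bool:
    """True if the RBS is the exact reverse complement of the MBS."""
    return revcomp_rna(mbs) == rbs.upper()


if __name__ == "__main__":
    for f in HYBRID_FEATURES:
        print(f"{f.name:36s} {f.start:>5d}-{f.end:<5d} {f.strand} {f.length:>5d} nt")
    print("GGGAU pairs with AUCCC:", is_watson_crick_pair("GGGAU", "AUCCC"))
```

Under these assumptions the MBS GGGAU and the RBS AUCCC are exact reverse complements
of each other, and the GFP cassette lies on the strand opposite to the CAT cassette.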

All references cited herein are expressly incorporated by reference. 

15 Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than 
routine experimentation, many equivalents to the specific embodiments of the invention 
described herein. Such equivalents are intended to be encompassed by the following
claims. 
What is claimed:



1. A plasmid comprising an rRNA gene having a mutant Anti-Shine-Dalgarno
sequence, at least one mutation in said rRNA gene, and a genetically engineered gene
which encodes a selectable marker having a mutant Shine-Dalgarno sequence, wherein
the mutant Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a mutually
compatible pair.

2. The plasmid of claim 1, wherein the rRNA gene is from a species selected from
the group consisting of Mycobacterium tuberculosis, Pseudomonas aeruginosa,
Salmonella typhi, Yersinia pestis, Staphylococcus aureus, Streptococcus pyogenes,
Enterococcus faecalis, Chlamydia trachomatis, Saccharomyces cerevisiae, Candida
albicans, and trypanosome.

3. The plasmid of claim 1, wherein the selectable marker is chosen from the group
consisting of chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP),
and both CAT and GFP.

4. The plasmid of claim 1, wherein the mutant Anti-Shine-Dalgarno sequence is
selected from the group consisting of the sequences set forth in Figures 12, 13, 15, and
16.

5. The plasmid of claim 1, wherein the mutant Shine-Dalgarno sequence is selected
from the group consisting of the sequences set forth in Figures 12, 13, 15, and 16.

6. The plasmid of claim 1, wherein the mutant Anti-Shine-Dalgarno sequence and
the mutant SD sequence are a mutually compatible pair selected from the group
consisting of the sequences set forth in Figures 12, 13, 15, and 16.

7. The plasmid of claim 6, wherein the mutually compatible mutant Shine-Dalgarno
and mutant Anti-Shine-Dalgarno pair permits translation by the rRNA of the selectable
marker.

8. The plasmid of claim 3, wherein the selectable marker is CAT.

9. The plasmid of claim 3, wherein the selectable marker is GFP.
10. A cell comprising the plasmid of claim 1.

11. The cell of claim 10, wherein the mutations in the rRNA gene affect the quantity
of selectable marker produced.

12. The cell of claim 10, wherein the cell is a bacterial cell.

13. The plasmid of claim 1, wherein the DNA sequence encoding the rRNA gene is
under the control of an inducible promoter.

14. A plasmid comprising an E. coli 16S rRNA gene having a mutant Anti-Shine-Dalgarno
sequence, at least one mutation in said 16S rRNA gene, and a genetically
engineered gene which encodes GFP having a mutant Shine-Dalgarno sequence, wherein
the mutant Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a mutually
compatible pair.

15. The plasmid of claim 14, wherein the mutant Anti-Shine-Dalgarno sequence is
selected from the group consisting of the sequences set forth in Figures 12, 13, 15, and
16.

16. The plasmid of claim 14, wherein the mutant Shine-Dalgarno sequence is
selected from the group consisting of the sequences set forth in Figures 12, 13, 15, and
16.

17. The plasmid of claim 14, wherein the mutant Anti-Shine-Dalgarno sequence and
the mutant Shine-Dalgarno sequence are a mutually compatible pair selected from the
group consisting of the sequences set forth in Figures 12, 13, 15, and 16.

18. The plasmid of claim 17, wherein the mutually compatible mutant Shine-Dalgarno
and mutant Anti-Shine-Dalgarno pair permits translation by the mutant 16S
rRNA of the selectable marker GFP.

19. A cell comprising the plasmid of claim 14.

20. The cell of claim 19, wherein the mutation in the 16S rRNA gene affects the
quantity of selectable marker produced.

21. The cell of claim 19, wherein the cell is a bacterial cell.

22. The plasmid of claim 14, wherein the DNA sequence encoding the 16S rRNA
gene is under the control of an inducible promoter.

23. A method for identifying functional mutant ribosomes comprising:

(a) transforming a host cell with a plasmid comprising an rRNA gene having
a mutant Anti-Shine-Dalgarno sequence, at least one mutation in said
rRNA gene, and a genetically engineered gene which encodes a selectable
marker having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair;

(b) isolating cells via the selectable marker; and

(c) identifying the rRNA from the cells from step (b), thereby identifying
functional mutant ribosomes.

24. A method for identifying functional mutant ribosomes comprising:

(a) transforming a host cell with a plasmid comprising an E. coli 16S rRNA
gene having a mutant Anti-Shine-Dalgarno sequence, at least one
mutation in said 16S rRNA gene, and a genetically engineered gene
which encodes GFP having a mutant Shine-Dalgarno sequence, wherein
the mutant Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence
are a mutually compatible pair;

(b) isolating cells via the GFP; and

(c) identifying the rRNA from the cells from step (b), thereby identifying
functional mutant ribosomes.

25. A method for identifying functional mutant ribosomes that may be suitable as
drug targets comprising:

(a) transforming a host cell with a plasmid comprising an rRNA gene having
a mutant Anti-Shine-Dalgarno sequence, at least one mutation in said
rRNA gene, and a genetically engineered gene which encodes a selectable
marker having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair;

(b) isolating cells via the selectable marker;

(c) identifying and sequencing the rRNA from the cells from step (b), thereby
identifying regions of interest;

(d) selecting regions of interest from step (c);

(e) mutating the regions of interest of step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an rRNA gene having a mutant Anti-Shine-Dalgarno
sequence and a genetically engineered gene which encodes a selectable
marker having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells of step (g) via the selectable marker; and

(i) identifying the rRNA from step (h), thereby identifying functional mutant
ribosomes that may be suitable as drug targets.

26. A method for identifying functional mutant ribosomes that may be suitable as
drug targets comprising:

(a) transforming a host cell with a plasmid comprising an E. coli 16S rRNA
gene having a mutant Anti-Shine-Dalgarno sequence, at least one
mutation in said 16S rRNA gene, and a genetically engineered gene
which encodes GFP having a mutant Shine-Dalgarno sequence, wherein
the mutant Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence
are a mutually compatible pair;

(b) isolating cells via the GFP;

(c) identifying and sequencing the rRNA from the cells from step (b), thereby
identifying regions of interest;

(d) selecting the regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an E. coli 16S rRNA gene having a mutant Anti-Shine-Dalgarno
sequence and a genetically engineered gene which encodes GFP
having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a mutually
compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells of step (g) via the GFP; and

(i) identifying the rRNA from step (h), thereby identifying functional mutant
ribosomes that may be suitable as drug targets.



WO 2004/0035H    PCT/US2003/020963



27. A method for identifying drug candidates comprising:

(a) transforming a host cell with the plasmid of claim 1;

(b) isolating cells via the selectable marker;

(c) identifying and sequencing the rRNA from step (b) to identify the regions
of interest;

(d) selecting regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an rRNA gene having a mutant Anti-Shine-Dalgarno
sequence and a genetically engineered gene which encodes a selectable
marker having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells from step (g) via the selectable marker;

(i) identifying the rRNA from step (h) to identify the functional mutant
ribosomes;

(j) screening drug candidates against functional mutant ribosomes from step
(i);

(k) identifying the drug candidates from step (j) that bound to the functional
mutant ribosomes from step (i);

(l) screening the drug candidates from step (k) against a human rRNA; and

(m) identifying the drug candidates from step (l) that do not bind to the human
rRNA, thereby identifying drug candidates.

28. A method for identifying drug candidates comprising:

(a) transforming a host cell with the plasmid of claim 14;

(b) isolating cells via the selectable marker;

(c) identifying and sequencing the rRNA from step (b) to identify the regions
of interest;

(d) selecting the regions of interest from step (c);

(e) mutating the regions of interest from step (d);

(f) inserting the mutated regions of interest from step (e) into a plasmid
comprising an E. coli 16S rRNA gene having a mutant Anti-Shine-Dalgarno
sequence and a genetically engineered gene which encodes GFP
having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a mutually
compatible pair;

(g) transforming a host cell with the plasmid from step (f);

(h) isolating cells from step (g) via the selectable marker;

(i) identifying the rRNA from step (h) to identify the functional mutant
ribosomes;

(j) screening drug candidates against the functional mutant ribosomes from
step (i);

(k) identifying the drug candidates from step (j) that bound to the functional
mutant ribosomes from step (i);

(l) screening the drug candidates from step (k) against a human 16S rRNA;
and

(m) identifying the drug candidates from step (l) that do not bind to the human
16S rRNA, thereby identifying drug candidates.
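
Steps (j) through (m) of the drug-candidate methods amount to a two-stage filter:
keep candidates that bind the functional mutant ribosomes, then discard those that
also bind the human rRNA. The Python sketch below only illustrates that logic. The
function name select_drug_candidates, the predicate arguments, and the candidate
labels are hypothetical; real binding calls would come from an assay, which is not
modeled here.

```python
from typing import Callable, Iterable, List


def select_drug_candidates(
    candidates: Iterable[str],
    binds_mutant_ribosomes: Callable[[str], bool],
    binds_human_rrna: Callable[[str], bool],
) -> List[str]:
    """Two-stage filter sketched from steps (j)-(m).

    Both predicates are placeholders for assay results supplied by the caller.
    """
    # Steps (j)-(k): keep candidates that bind the functional mutant ribosomes.
    bound = [c for c in candidates if binds_mutant_ribosomes(c)]
    # Steps (l)-(m): of those, keep candidates that do not bind human rRNA.
    return [c for c in bound if not binds_human_rrna(c)]


if __name__ == "__main__":
    # Hypothetical assay outcomes, invented purely for illustration.
    mutant_binding = {"cand-A": True, "cand-B": True, "cand-C": False}
    human_binding = {"cand-A": True, "cand-B": False, "cand-C": False}

    hits = select_drug_candidates(
        mutant_binding,
        lambda c: mutant_binding[c],
        lambda c: human_binding[c],
    )
    print(hits)
```

Passing the assays in as predicates keeps the filter independent of how binding is
actually measured.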
Nucleotide      Description

1-1542
1536-1540       16S MBS (message binding sequence) GGGAU
                16S-23S spacer region
1983-4886       23S rRNA of Escherichia coli rrnB operon
                trp^c promoter
                trp^c promoter
                CAT RBS (ribosome binding sequence) AUCCC
                cam (chloramphenicol acetyltransferase; CAT)
                lacI^q promoter
                lacI^q (lac repressor)
12985-13026     lac UV5 promoter

MBS = message binding site = Anti-Shine-Dalgarno sequence

RBS = ribosome binding site = Shine-Dalgarno sequence

Fig. 1

Fig. 7

Fig. 8
Nucleotide      Description

1-931           part of 16S rRNA from Escherichia coli rrnB operon
932-1542        part of 16S rRNA from Mycobacterium tuberculosis rrn operon
1536-1540       16S MBS (message binding sequence) GGGAU
1791-1834       terminator T1 of Escherichia coli rrnB operon
1965-1994       terminator T2 of Escherichia coli rrnB operon
3054-2438       replication origin
3214-4074       bla (β-lactamase; ampicillin resistance)
5726-4992       GFP (Green Fluorescent Protein)
5738-5734       GFP RBS (ribosome binding sequence) AUCCC
5795-5755       trp^c promoter
6270-6310       trp^c promoter
6327-6331       CAT RBS (ribosome binding sequence) AUCCC
6339-6998       cam (chloramphenicol acetyltransferase; CAT)
7307-7384       lacI^q promoter
7385-8467       lacI^q (lac repressor)
8510-8551       lac UV5 promoter
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GACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACT 

TGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGT 

AAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAAC 

TTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACA 

ACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGA 

AGCCATACCAAACGACGAGCGTGACACCACGATGCCTGCAGCAATGGCAAC 

AACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAA 

CAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCT 

CGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCG 

TGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGT 

ATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAAT 

AGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAG

ACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTT

AAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTT 

AACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCTTAATAAGATGATCTTCT 

TGAGATCGTTTTGGTCTGCGCGTAATCTCTTGCTCTGAAAACGAAAAAACCG

CCTTGCAGGGCGGTTmCGAAGGTTCTCTGAGCTACCAACTCTTTGAACCGA 

GGTAACTGGCTTGGAGGAGCGCAGTCACCAAAACTTGTCCTTTCAGTTTAGC 

CTTAACCGGCGCATGACTTCAAGACTAACTCCTCTAAATCAA1TACCAGTGG 

CTGCTGCCAGTGGTGCTTTTGCATGTCTTTCCGGGTTGGACTCAAGACGATAG 

TTACCGGATAAGGCGCAGCGGTCGGACTGAACGGGGGGTTCGTGCATACAG 

TCCAGCTTGGAGCGAACTGCCTACCCGGAACTGAGTGTCAGGCGTGGAATGA 

GACAAACGCGGCCATAACAGCGGAATGACACCGGTAAACCGAAAGGCAGGA 

ACAGGAGAGCGCACGAGGGAGCCGCCAGGGGGAAACGCCTGGTATCTTTAT 

AGTCCTGTCGGGTTTCGCCACCACTGATTTGAGCGTCAGATTTCGTGATGCTT 

GTCAGGGGGGCGGAGCCTATGGAAAAACGGCTTTGCCGCGGCCCTCTCACTT 

CCCTGTTAAGTATCTTCCTGGCATCTTCCAGGAAATCTCCGCCCCGTTCGTAA 

GCCATTTCCGCTCGCCGCAGTCGAACGACCGAGCGTAGCGAGTCAGTGAGCG 

AGGAAGCGGAATATATCCTGTATCACATATTCTGCTGACGCACCGGTGCAGC 

CTTTTTTCTCCTGCCACATGAAGCACrrCACTGACACCCTCATCAGTGCCAAC 

ATAGTAAGCCAGTATACACTCCGCTAGCATCGTCCATTCCGACAGCATCGCC

AGTCACTATGGCGTGCTGCTAGCGCTATATGCGTTGATGCAATTTCTATGCGC 

ACCCGTTCTCGGAGCACTGTCCGACCGCTTTGGCCGCCGCCCAGTCCTGCTCG 

CTTCGCTACTTGGAGCCACTATCGACTACGCGATCATGGCGACCACACCCGT 

CCTGTGGATCCTCTACGCCGGACGCATCGTGGCCGGCCACGATGCGTCCGGC 

GTAGAGGATCTATTTAACGACCCTGCCCTGAACCGACGACCGGGTCGAATTT

GCTTTCGAATTTCTGCCATTCATCCGCTTATTATCACTTATTCAGGCGTAGCA 

CCAGGCGTTTAAGGGCACCAATAACTGCCTTAAAAAAATTACGCCCCGCCCT 

GCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAA 

GCCATCACAGACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTG 

TCGCCTTGCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGT 

CCATATTGGCCACGTTTAAATCAAAACTGGTGAAACTCACCCAGGGATTGGC 

TGAGACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATAGGCCAGGTTT 

TCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAAT 

CGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAA 

Fig. 22 
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AACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTC 

ATTGCCATACGGAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAA 

TAAAGGCCGGATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCC 

GTAATATCCAGCTGAACGGTCTGGTTATAGGTACATTGAGCAACTGACTGAA 

ATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATAT 

CCAGTGATTTTTTTCTCCATTTCTCGAGCACACTGAAAGCGGCCGCTTCCACA 

CATTAAACTAGTTCGATGATTAATTGTCAACAGCTCGCCGCTATATGCGTTGA 

TGCAATTTCTATGCGCACCCGTTCTCGGAGCACTGTCCGACCGCTTTGGCCGC 

CGCCCAGTCCTGCTCGCTTCGCTACTTGGAGCCACTATCGACTACGCGATCAT 

GGCGACCACACCCGTCCTGTGGATCCCAGACGAGTTAAGTCACCATACGTTA 

GTACAGGTTGCCACTCTTTTGGCAGACGCAGACCTACGGCTACAATAGCGAA

GCGGTCCTGGTATTCATGTTTAAAAATACTGTCGCGATAGCCAAAACGGCAC 

TCTTTGGCAGTTAAGCGCACTTGCTTGCCTGTCGCCAGTTCAACAGAATCAAC

ATAAGCGCAAACTCGCTGTAATTCTACGCCATAAGCACCAATATTCTGGATA

GGTGATGAGCCGACACAACGAGGAATTAATGCCAGATTTTCCAGACCAGGC

ATACCTTCCTGCAAAGTGTATTTTACCAGACGATGCCAGTTTTCTCCGGCTCC

TACATGTAAATACCACGCATCAGGTTCATCATGAATTTCGATACCTTTGATCC

GGTTGATGATCACCGTGCCGCGATAGTCCTCCAGAAAAAGTACATTACTTCC 

TTCACCCAGAATAAGAACGGGTTGTCCTTCTGCGGTTGCATACTGCCAGGCA 

TTGAGTAATTGTTGTTCGTCTTCGGCACATACAATGTGCTGAGCATTATGATC 

AATGCCAAATGTGTTCCAGGGTTTTAAGGAGTGGTTCATAGCTGCTTTCCTGA 

TGCAAAAACGAGGCTAGTTTACCGTATCTGTGGGGGGATGGCTTGTAGATAT 

GACGACAGGAAGAGTTTGTAGAAACGCAAAAAGGCCATCCGTCAGGATGGC 

CTTCTGCTTAATTTGATGCCTGGCAGTTTATGGCGGGCGTCCTGCCCGCCACC 

CTCCGGGCCGTTGCTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTA 

CTCAGGAGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTCT 

TTCGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACTCTCGCAT 

GGGGAGACCCCACACTACCATCGGCGCTACGGCGTTTCACTTCTGAGTTCGG 

CATGGGGTCAGGTGGGACCACCGCGCTACTGCCGCCAGGCAAATTCTGTTTT 

ATCAGACCGCTTCTGCGTTCTGATTTAATCTGTATCAGGCTGAAAATCTTCTC 

TCATCCGCCAAAACAGCTTCGGCGTTGTAAGGTTAAGCCTCACGGTTCATTA 

GTACCGGTTAGCTCAACGCATCGCTGCGCTTACACACCCGGCCTATCAACGT 

CGTCGTCTTCAACGTTCCTTCAGGACCCTTAAAGGGTCAGGGAGAACTCATC 

TCGGGGCAAGTTTCGTGCTTAGATGCTTTCAGCACTTATCTCTTCCGCATTTA 

GCTACCGGGCAGTGCCATTGGCATGACAACCCGAACACCAGTGATGCGTCCA 

CTCCGGTCCTCTCGTACTAGGAGCAGCCCCCCTCAGTTCTCCAGCGCCCACG 

GCAGATAGGGACCGAACTGTCTCACGACGTTCTAAACCCAGCTCGCGTACCA 

CTTTAAATGGCGAACAGCCATACCCTTGGGACCTACTTCAGCCCCAGGATGT 

GATGAGCCGACATCGAGGTGCCAAACACCGCCGTCGATATGAACTCTTGGGC 

GGTATCAGCCTGTTATCCCCGGAGTACCTTTTATCCGTTGAGCGATGGCCCTT 

CCATTCAGAACCACCGGATCACTATGACCTGCTTTCGCACCTGCTCGCGCCGT 

CACGCTCGCAGTCAAGCTGGCTTATGCCATTGCACTAACCTCCTGATGTCCG 

ACCAGGATTAGCCAACCTTCGTGCTCCTCCGTTACTCTTTAGGAGGAGACCG 

CCCCAGTCAAACTACCCACCAGACACTGTCCGCAACCCGGATTACGGGTCAA 

CGTTAGAACATCAAACATTAAAGGGTGGTATTTCAAGGTCGGCTCCATGCAG
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ACTGGCGTCCACACTTCAAAGCCTCCCACCTATCCTACACATCAAGGCTCAA 

TGTTCAGTGTCAAGCTATAGTAAAGGTTCACGGGGTCTTTCCGTCTTGCCGCG 

GGTACACTGCATCTTCACAGCGAGTTCAATTTCACTGAGTCTCGGGTGGAGA 

CAGCCTGGCCATCATTACGCCATTCGTGCAGGTCGGAACTTACCCGACAAGG 

AATTTCGCTACCTTAGGACCGTTATAGTTACGGCCGCCGTTTACCGGGGCTTC 

GATCAAGAGCTTCGCTTGCGCTAACCCCATCAATTAACCTTCCGGCACCGGG 

CAGGCGTCACACCGTATACGTCCACTTTCGTGTTTGCACAGTGCTGTGTTTTT

AATAAACAGTTGCAGCCAGCTGGTATCTTCGACTGATTTCAGCTCCATCCGC 

GAGGGACCTCACCTACATATCAGCGTGCCTTCTCCCGAAGTTACGGCACCAT 

TTTGCCTAGTTCCTTCACCCGAGTTCTCTCAAGCGCCTTGGTATTCTCTACCTG

ACCACCTGTGTCGGTTTGGGGTACGATTTGATGTTACCTGATGCTTAGAGGCT 

TTTCCTGGAAGCAGGGCATTTGTTGCTTCAGCACCGTAGTGCCTCGTCATCAC 

GCCTCAGCCTTGATTTTCCGGATTTGCCTGGAAAACCAGCCTACACGCTTAA 

ACCGGGACAACCGTCGCCCGGCCAACATAGCCTTCTCCGTCCCCCCTTCGCA 

GTAACACCAAGTACAGGAATATTAACCTGTTTCCCATCGACTACGCCTTTCG 

GCCTCGCCTTAGGGGTCGACTCACCCTGCCCCGATTAACGTTGGACAGGAAC 

CCTTGGTCTTCCGGCGAGCGGGCTTTTCACCCGCTTTATCGTTACTTATGTCA 

GCATTCGCACTTCTGATACCTCCAGCATGCCTCACAGCACACCTTCGCAGGCT 

TACAGAACGCTCCCCTACCCAACAACGCATAAGCGTCGCTGCCGCAGCTTCG 

GTGCATGGTTTAGCCCCGTTACATCTTCCGCGCAGGCCGACTCGACCAGTGA

GCTATTACGCTTTCTTTAAATGATGGCTGCTTCTAAGCCAACATCCTGGCTGT

CTGGGCCTTCCCACATCGTTTCCCACTTAACCATGACTTTGGGACCTTAGCTG

GCGGTCTGGGTTGTTTCCCTCTTCACGACGGACGTTAGCACCCGCCGTGTGTC 

TCCCGTGATAACATTCTCCGGTATTCGCAGTTTGCATCGGGTTGGTAAGTCGG 

GATGACCCCCTTGCCGAAACAGTGCTCTACCCCCGGAGATGAATTCACGAGG 

CGCrACCTAAATAGCTTTCGGGGAGAACCAGCTATCTCCCGGTTTGATTGGC 

CmCACCCCCAGCCACAAGTCATCCGCTAATTTTTCAACATTAGTCGGTTCG 

GTCCTCCAGTTAGTGTTACCCAACCTTCAACCTGCCCATGGCTAGATCACCGG 

GTTTCGGGTCTATACCCTGCAACTTAACGCCCAGTTAAGACTCGGTTTCCCTT 

CGGCTCCCCTATTCGGTTAACCTTGCTACAGAATATAAtiTCGCTGACCCATTA 

TACAAAAGGTACGCAGTCACACGCCTAAGCGTGCTCCCACTGCTTGTACGTA 

CACGGTTTCAGGTrCTTTTTCACTCCCCTCGCCGGGGTTCTTTTCGCCTTTCCC 

TCACGGTACTGGTTCACTATTGGTCAGTCAGGAGTATTTAGCCTTGGAGGAT

GGTCCCCCCATATTCAGACAGGATACCACGTGTCCCGCCCTACTCATCGAGC 

TCACAGCATGTGCATTTTTGTGTACGGGGCTGTCACCCTGTATCGCGCGCCTT 

TCCAGACGCTTCCACTAACACACACACTGATTCAGGCTCTGGGCTGCTCCCC 

GTTCGCTCGCCGCTACTGGGGGAATCTCGGTTGATTTCTTTTCCTCGGGGTAC 

TTAGATGTTTCAGTTCCCCCGGTTCGCCTCATTAACCTATGGATTCAGTTAAT 

GATAGTGTGTCGAAACACACTGGGTTTCCCCATTCGGAAATCGCCGGTTATA 

ACGGTTCATATCACCTTACCGACGCTTATGGCAGATTAGCACGTCCTTCATCG 

CCTCTGACTGCCAGGGCATCCACCGTGTACGCTTAGTCGCTTAACCTCACAA 

CCCGAAGATGmCTTTCGATTCATCATCGTGTTGCGAAAATTTGAGAGACTC 

ACGAAGAACTCTCGTTGTTCAGTGTTTCAATTTTCAGCTTGATCCAGATTTTT

AAAGAGCAAAAATCTCAAACATCACCCGAAGATGAGTTTTGAGATATTAAG 

GTCGGCGACTTTCACTCACAAACCAGCAAGTGGCGTCCCCTAGGGGATTCGA 
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ACCCCTGTTACCGCCGTGAAAGGGCGGTGTCCTGGGCCTCTAGACGAAGGGG 

ACACGAAAATTGCTTATCACGCGTTGCGTGATATTTTCGTGTAGGGTGAGCTT 

TCATTAATAGAAAGCGAACGGCCTTATTCTCTTCAGCCTCACTCCCAACGCGT 

AAACGCCTTGCTnrCACTTTCTATCAGACAATCTGTGTGAGCACTACAAAGT 

ACGCTTCTTTAAGGTAAGTGTGTGATCCAACCGCAGGTTCCCCTACGGTTACC 

TTGTTACGACTTCACCCCAGTCATGAATCACAAAGTGGTAAGCGCCCTCCCG 

AAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCG 

GTGTGTACAAGGCCCGGGAACGTATTCACCGTGGCATTCTGATCCACGATTA 

CTAGCGATTCCGACTTCATGGAGTCGAGTTGCAGACTCCAATCCGGACTACG 

ACGCACnTATGAGGTCCGCTTGCTCTCGCGAGGTCGCTTCTCTTTGTATGCG 

CCATTGTAGCACGTGTGTAGCCCTGGTCGTAAGGGCCATGATGACTTGACGT 

CATCCCCACCTTCCTCCAGTTTATCACTGGCAGTCTCCTTTGAGTTCCCGGCC 

GGACCGCTGGCAACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCC 

AACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTT 

CCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGT 

AAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGG 

GCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGT 

CGACTTAACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAG 

TCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCC 

ACGCnTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCACC 

GGTATTCCTCCAGATCTCTACGCAnTCACCGCTACACCTGGAATTCTACCCC 

CCTCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCC 

CGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAG 

TAATrCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCTGCTGGCACGGA 

GTTAGCCGGTGCTTCTTCTGCGGGTAACGTCAATGAGCAAAGGTATTAACTT 

TACTCCCTTCCTCCCCGCTGAAAGTACTTTACAACCCGAAGGCCTTCTTCATA 

CACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTG 

CTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATC 

CTCTCAGACCAGCTAGGGATCGTCGCCTAGGTGAGCCGTTACCCCACCTACT 

AGCTAATCCCATCTGGGCACATCCGATGGCAAGAGGCCCGAAGGTCCCCCTC 

TTTGGTCTTGCGACGTTATGCGGTATTAGCTACCGTTrCCAGTAGTTATCCCC 

CTCCATCAGGCAGTTTCCCAGACATTACTCACCCGTCCGCCACTCGTCAGCA 

AAGAAGCAAGCTTCTTCCTGTTACCGTTCGACTTGCATGTGTTAGGCCTGCCG 

CCAGCGTTCAATCTGAGCCATGATCAAACTCTTCAATTTAAAAGTTTGACGCT 

CAAAGAATTAAACTTCGTAATGAATTACGTGTTCACTCTTGAGACTTGGTATT 

CATTTTTCGTCTTGCGACGTTAAGAATCCGTATCTTCGAGTGCCCACACAGAT 

TGTCTGATAAATTGTTAAAGAGCAGTGCCGCTTCGCTTTTTCTCAGCGGCCGC 

TGTGTGAAATTGTTATCCGCTCACAATTCCACACATTATACGAGCCGGAAGC

ATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTG 

CGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCAT 

TAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAG 

GGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCG 

CCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAG 

GCGAAAATCCTGTTTGATGGTGGTTGACGGCGGGATATAACATGAGCTGTCT 

TCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCGG 
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ACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAG 

CATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAA 

CCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATT 

GCGAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGA 

ACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAGA 

TGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGA 

TGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGG 

CAGCTTCCACAGCAATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAG 

CCCACTGACCCGTTGCGCGAGAAGATTGTGCACCGCCGCTTTACAGGCTTCG 

ACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGG 

CGCGAGATTTAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACT 

GGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCC 

ACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCG 

CGTTTTCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGA 

TAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCACAT 

TCACCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAA 

GGTTTTGCACCATTCGATGGTGTCGGATCCTAGAGCGCACGAATGAGGGCCG 

ACAGGAAGCAAAGCTGAAAGGAATCAAATTTGGCCGCAGGCGTACCGTGGA 

CAGGAACGTCGTGCTGACGCTTCATCAGAAGGGCACTGGTGCAACGGAAATT 

GCTCATCAGCTCAGTATTGCCCGCTCCACGGTTTATAAAATTCTTGAAGACG 

AAAGGGGCTCGTGCATACGCCTATTTTTATAGGTTAATGTCATGATAATAAT 

GGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCT

ATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGGTCATGAGACAATA 

ACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAA

CATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTT

GCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGT 

GCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGA 

GTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTA 

TGTGGCGCGGTATTATCCCGTGTT 
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GATCCTCTACGCCGGACGCATCGTGGCCGGCCACGATGCGTCCGGCGTAGAG 

GATCTATTTAACGACCCTGCCCTGAACCGACGACCGGGTCGAATTTGCTTTC 

GAATTTCTGCCATTCATCCGCTTATTATCACTTATTCAGGCGTAGCACCAGGC 

GTTTAAGGGCACCAATAACTGCCTTAAAAAAATTACGCCCCGCCCTGCCACT 

CATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCATC 

ACAGACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCT 

TGCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATAT 

TGGCCACGTTTAAATCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAGAC 

GAAAAACATATTCTCAATAAACCCTTTAGGGAAATAGGCCAGGTTTTCACCG 

TAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAATCGTCGT 

GGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAAACGGT 

GTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCC

ATACGGAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAG

GCCGGATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAAT

ATCCAGCTGAACGGTCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCC 

TCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGT 

GATTTTTTTCTCCATTTGCGGAGGGATATGAAAGCGGCCGCTTCCACACATTA

AACTAGTTCGATGATTAATTGTCAACAGCTCGCCGGCGGCACCTCGCTAACG

GATTCACCACTCCAAGAATTGGAGCCAATCGATTCTTGCGGAGAACTGTGAA 

TGCGCAAACCAACCCTTGGCAGAACATATCCATCGCGTCCGCCATCTCCAGC 

AGCCGCACGCGGCGCATCTCGGGCAGCGTTGGGTCCTGGCCACGGGTGCGCA 

TGATCGTGCTCCTGTCGTTGAGGACCCGGCTAGGCTGGCGGGGTTGCCTTAC 

TGGTTAGCAGAATGAATCACCGATACGCGAGCGAACGTGAAGCGACTGCTG 

CTGCAAAACGTCTGCGACCTGAGCAACAACATGAATGGTCTTCGGTTTCCGT 

GTTTCGTAAAGTCTGGAAACGCGGAAGTCAGCGCCCTGCACCATTATGTTCC 

GGATCTGGGTACCGAGCTCGAATTCACTGGCCGTCGTTTTACAACGTCGTGA 

CTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCT

TTCGCCAGGCATCGCAGGATGCTGCTGGCTACCCTGTGGAACACCTACATCT 

GTATTAACGAAGCGCTGGCATTGACCCTGAGTGATmTCTGTGGTCCCGCCG 

CATCCATACCGCCAGTTGTTTACCCTCACAACGTTCCAGTAACCGGGCATGTT 

CATCATCAGTAACCCGTATCGTGAGCATCCTCTCTCGTTTCATCGGTATCATT 

ACCCCCATGAACAGAAATTCCCCCTTACACGGAGGCATCAAGTGACCAAACA 

GGAAAAAACCGCCCTTAACATGGCCCGCTTTATCAGAAGCCAGACATTAACG 

CTTCTGGAGAAACTCAACGAGCTGGACGCGGATGAACAGGCAGACATCTGT 

GAATCGCTTCACGACCACGCTGATGAGCTTTACCGCAGCTGCCTCGCGCGTT 

TCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCAC

AGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTC 

AGCGGGTGTTGGCGGGTGTCGGGGCGCAGCCATGACCCAGTCACGTAGCGA 

TAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGA 

GAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAAT 

ACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC 

GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTAT 

CCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGC 

AAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCT 

CCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCG 

AAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTC 
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GTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCT 

CCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTT 

CGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCA 

GCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTA 

AGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGA 

GCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACG 

GCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTAC 

CTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGT 

AGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGAT

CTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGA

AAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACC 

TAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATG 

AGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTC 

AGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGA 

TAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACC 

GCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCC 

GGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGT 

CTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTT 

GCGCAACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTG 

GTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATC 

CCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCA 

GAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAA 

TTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT 

CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCC 

GGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTC 

ATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGT 

TGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCT

TTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCG 

CAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCT 

TTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACA 

TATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCC 

CCGAAAAGTGCCACCTGACGTGTAAGAAACCATTATTATCATGACATTAACC 

TATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAAGAATTCTCATGTTT

GACAGCTTATCATCGATAAGCTTTAATGCGGTAGTTTATCACAGTTAAATTGC 

TAACGCAGTCAGGCACCGTGTATGAAATCTAACAATGCGCTCATCGTCATCC 

TCGGCACCGTCACCCTGGATGCTGTAGGCATAGGCTTGGTTATGCCGGTACT 

GCCGGGCCTCTTGCGGGATATCGTCCATTCCGACAGCATCGCCAGTCACTAT 

GGCGTGCTGCTAGCGCTATATGCGTTGATGCAATTTCTATGCGCACCCGTTCT 

CGGAGCACTGTCCGACCGCTTTGGCCGCCGCCCAGTCCTGCTCGCTTCGCTAC 

TTGGAGCCACTATCGACTACGCGATCATGGCGACCACACCCGTCCTGTGGAT 

CCCAGACGAGTTAAGTCACCATACGTTAGTACAGGTTGCCACTCTTTTGGCA

GACGCAGACCTACGGCTACAATAGCGAAGCGGTCCTGGTATTCATGTTTAAA 

AATACTGTCGCGATAGCCAAAACGGCACTCTTTGGCAGTTAAGCGCACTTGC 

TTGCCTGTCGCCAGTTCAACAGAATCAACATAAGCGCAAACTCGCTGTAATT 

CTACGCCATAAGCACCAATATTCTGGATAGGTGATGAGCCGACACAACCAGG 

AATTAATGCCAGATTTTCCAGACCAGGCATACCTTCCTGCAAAGTGTATTTTA 
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CCAGACGATGCCAGTTTTCTCCGGCTCCTACATGTAAATACCACGCATCAGG 

TTCATCATGAATTTCGATACCTTTGATCCGGTTGATGATCACCGTGCCGCGAT 

AGTCCTCCAGAAAAAGTACATTACTTCCTTCACCCAGAATAAGAACGGGTTG 

TCCTTCTGCGGTTGCATACTGCCAGGCATTGAGTAATTGTTGTTCGTCTTCGG 

CACATACAATGTGCTGAGCATTATGATCAATGCCAAATGTGTTCCAGGGTTT 

TAAGGAGTGGTTCATAGCTGCTTTCCTGATGCAAAAACGAGGCTAGTTTACC 

GTATCTGTGGGGGGATGGCTTGTAGATATGACGACAGGAAGAGTTTGTAGAA 

ACGCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCCTGGC 

AGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTGCTTCGCAACG 

TTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACA

AACAACAGATAAAACGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTAT

TTGATGCCTGGCAGTTCCCTACTCTCGCATGGGGAGACCCCACACTACCATC 

GGCGCTACGGCGTTTCACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACC 

GCGCTACTGCCGCCAGGCAAATTCTGTTTTATCAGACCGCTTCTGCGTTCTGA

TTTAATCTGTATCAGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCTTCGGC 

GTTGTAAGGTTAAGCCTCACGGTTCATTAGTACCGGTTAGCTCAACGCATCG 

CTGCGCTTACACACCCGGCCTATCAACGTCGTCGTCTTCAACGTTCCTTCAGG 

ACCCTTAAAGGGTCAGGGAGAACTCATCTCGGGGCAAGTTTCGTGCTTAGAT

GCTTTCAGCACTTATCTCTTCCGCATTTAGCTACCGGGCAGTGCCATTGGCAT 

GACAACCCGAACACCAGTGATGCGTCCACTCCGGTCCTCTCGTACTAGGAGC 

AGCCCCCCTCAGTTCTCCAGCGCCCACGGCAGATAGGGACCGAACTGTCTCA 

CGACGTTCTAAACCCAGCTCGCGTACCACTTTAAATGGCGAACAGCCATACC 

CTTGGGACCTACTTCAGCCCCAGGATGTGATGAGCCGACATCGAGGTGCCAA 

ACACCGCCGTCGATATGAACTCTTGGGCGGTATCAGCCTGTTATCCCCGGAG 

TACCTTTTATCCGTTGAGCGATGGCCCTTCCATTCAGAACCACCGGATCACTA 

TGACCTGCTTTCGCACCTGCTCGCGCCGTCACGCTCGCAGTCAAGCTGGCTTA 

TGCCATTGCACTAACCTCCTGATGTCCGACCAGGATTAGCCAACCTTCGTGCT 

CCTCCGTTACTCTTTAGGAGGAGACCGCCCCAGTCAAACTACCCACCAGACA 

CTGTCCGCAACCCGGATTACGGGTCAACGTTAGAACATCAAACATTAAAGGG 

TGGTATTTCAAGGTCGGCTCCATGCAGACTGGCGTCCACACTTCAAAGCCTC 

CCACCTATCCTACACATCAAGGCTCAATGTTCAGTGTCAAGCTATAGTAAAG 

GTTCACGGGGTCTTTCCGTCTTGCCGCGGGTACACTGCATCTTCACAGCGAGT 

TCAATTTCACTGAGTCTCGGGTGGAGACAGCCTGGCCATCATTACGCCATTC 

GTGCAGGTCGGAACTTACCCGACAAGGAATTTCGCTACCTTAGGACCGTTAT 

AGTTACGGCCGCCGTTTACCGGGGCTTCGATCAAGAGCTTCGCTTGCGCTAA 

CCCCATCAATTAACCTTCCGGCACCGGGCAGGCGTCACACCGTATACGTCCA 

CTTTCGTGTTTGCACAGTGCTGTGTTTTTAATAAACAGTTGCAGCCAGCTGGT

ATCTTCGACTGATTTCAGCTCCATCCGCGAGGGACCTCACCTACATATCAGC

GTGCCTTCTCCCGAAGTTACGGCACCATTTTGCCTAGTTCCTTCACCCGAGTT 

CTCTCAAGCGCCTTGGTATTCTCTACCTGACCACCTGTGTCGGTTTGGGGTAC 

GATTTGATGTTACCTGATGCTTAGAGGCTTTTCCTGGAAGCAGGGCATTTGTT

GCTTCAGCACCGTAGTGCCTCGTCATCACGCCTCAGCCTTGATTTTCCGGATT

TGCCTGGAAAACCAGCCTACACGCTTAAACCGGGACAACCGTCGGCCGGCCA 

ACATAGCCTTCTCCGTCCCCCCTTCGCAGTAACACCAAGTACAGGAATATTA 

ACCTGTTTCCCATCGACTACGCCTTTCGGCCTCGCCTTAGGGGTCGACTCACC 

CTGCCCCGATTAACGTTGGACAGGAACCCTTGGTCTTCCGGCGAGCGGGCTT 
TTCACCCGCTTTATCGTTACTTATGTCAGCATTCGCACTTCTGATACCTCCAG

CATGCCTCACAGCACACCTTCGCAGGCTTACAGAACGCTCCCCTACCCAACA 

ACGCATAAGCGTCGCTGCCGCAGCTTCGGTGCATGGTTTAGCCCCGTTACAT 

CTTCCGCGCAGGCCGACTCGACCAGTGAGCTATTACGCTTTCTTTAAATGATG 

GCTGCTTCTAAGCCAACATCCTGGCTGTCTGGGCCTTCCCACATCGTTTCCCA 

CTTAACCATGACTTTGGGACCTTAGCTGGCGGTCTGGGTTGTTTCCCTCTTCA 

CGACGGACGTTAGCACCCGCCGTGTGTCTCCCGTGATAACATTCTCCGGTATT 

CGCAGTTTGCATCGGGTTGGTAAGTCGGGATGACCCCCTTGCCGAAACAGTG 

CTCTACCCCCGGAGATGAATTCACGAGGCGCTACCTAAATAGCTTTCGGGGA 

GAACCAGCTATCTCCCGGTTTGATTGGCCTTTCACCCCCAGCCACAAGTCATC 

CGCTAATTTTTCAACATTAGTCGGTTCGGTCCTCCAGTTAGTGTTACCCAACC 

TTCAACCTGCCCATGGCTAGATCACCGGGTTTCGGGTCTATACCCTGCAACTT 

AACGCCCAGTTAAGACTCGGTTTCCCTTCGGCTCCCCTATTCGGTTAACCTTG 

CTACAGAATATAAGTCGCTGACCCATTATACAAAAGGTACGCAGTCACACGC 

CTAAGCGTGCTCCCACTGCTTGTACGTACACGGTTTCAGGTTCTTTTTCACTC

CCCTCGCCGGGGTTCTTTTCGCCTTTCCCTCACGGTACTGGTTCACTATCGGT 

CAGTCAGGAGTATTTAGCCTTGGAGGATGGTCCCCCCATATTCAGACAGGAT 

ACCACGTGTCCCGCCCTACTCATCGAGCTCACAGCATGTGCATTTTTGTGTAC 

GGGGCTGTCACCCTGTATCGCGCGCCTTTCCAGACGCTTCCACTAACACACA 

CACTGATTCAGGCTCTGGGCTGCTCCCCGTTCGCTCGCCGCTACTGGGGGAA 

TCTCGGTTGATTTCTTTTCCTCGGGGTACTTAGATGTTTCAGTTCCCCCGGTTC 

GCCTCATTAACCTATGGATTCAGTTAATGATAGTGTGTCGAAACACACTGGG 

TTTCCCCATTCGGAAATCGCCGGTTATAACGGTTCATATCACCTTACCGACGC 

TTATCGCAGATTAGCACGTCCTTCATCGCCTCTGACTGCCAGGGCATCCACCG 

TGTACGCTTAGTCGCTTAACCTCACAACCCGAAGATGTTTCTTTCGATTCATC 

ATCGTGTTGCGAAAATTTGAGAGACTCACGAACAACTCTCGTTGTTCAGTGT 

TTCAATTTTCAGCTTGATCCAGATTTTTAAAGAGCAAAAATCTCAAACATCAC 

CCGAAGATGAGTTTTGAGATATTAAGGTCGGCGACTTTCACTCACAAACCAG 

CAAGTGGCGTCCCCTAGGGGATTCGAACCCCTGTTACCGCCGTGAAAGGGCG 

GTGTCCTGGGCCTCTAGACGAAGGGGACACGAAAATTGCTTATCACGCGTTG 

CGTGATATTTTCGTGTAGGGTGAGCTTTCATTAATAGAAAGCGAACGGCCTT 

ATTCTCTTCAGCCTCACTCCCAACGCGTAAACGCCTTGCTTTTCACTTTCTATC

AGACAATCTGTGTGAGCACTACAAAGTACGCTTCTTTAAGGTAATCCCATGA 

TCCAACCGCAGGTTCCCCTACGGTTACCTTGTTACGACTTCACCCCAGTCATG 

AATCACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTG 

CAACCCACTCCCATGGTGTGACGGGCGGTGTGTACAAGGCCCGGGAACGTAT 

TCACCGTGGCATTCTGATCCACGATTACTAGCGATTCCGACTTCATGGAGTCG 

AGTTGCAGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTC 

TCGCGAGGTCGCTTCTCTTTGTATGCGCCATTGTAGCACGTGTGTAGCCCTGG

TCGTAAGGGCCATGATGACTTGACGTCATCCCCACCTTCCTCCAGTTTATCAC 

TGGCAGTCTCCTTTGAGTTCCCGGCCGGACCGCTGGCAACAAAGGATAAGGG 

TTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACA

GCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAA 

ACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAA 

ACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAAC 

CTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCC 
ACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGGACTACC

AGGGTATCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTT 

CGTCCAGGGGGCCGCCTTCGCCACCGGTATTCCTCCAGATCTCTACGCATTTC 

ACCGCTACACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGTA 

TCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAA 

CCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCG 

TATTACCGCGGCTGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGCGGGTAAC 

GTCAATGAGCAAAGGTATTAACTTTACTCCCTTCCTCCCCGCTGAAAGTACTT 

TACAACCCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAGGCTTGCG 

CCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTC 

TCAGTTCCAGTGTGGCTGGTCATCCTCTCAGACCAGCTAGGGATCGTCGCCT 

AGGTGAGCCGTTACCCCACCTACTAGCTAATCCCATCTGGGCACATCCGATG 

GCAAGAGGCCCGAAGGTCCCCCTCTTTGGTCTTGCGACGTTATGCGGTATTA 

GCTACCGTTTCCAGTAGTTATCCCCCTCCATCAGGCAGTTTCCCAGACATTAC

TCACCCGTCCGCCACTCGTCAGCAAAGAAGCAAGCTTCTTCCTGTTACCGTTC 

GACTTGCATGTGTTAGGCCTGCCGCCAGCGTTCAATCTGAGCCATGATCAAA 

CTCTTCAATTTAAAAGTTTGACGCTCAAAGAATTAAACTTCGTAATGAATTAC 

GTGTTCACTCTTGAGACTTGGTATTCATTTTTCGTCTTGCGACGTTAAGAATC 

CGTATCTTCGAGTGCCCACACAGATTGTCTGATAAATTGTTAAAGAGCAGTG 

CCGCTTCGCTTTTTCTCAGCGGCCGCTGTGTGAAATTGTTATCCGCTCACAAT 

TCCACACATTATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAA 

TGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTC

GGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAG 

AGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGAC 

GGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAG 

CGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTG 

ACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGA 

GATATCCGCACCAACGCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCC 

AGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCAT 

TCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCC 

CGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAG 

CCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGA 

TTTGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTC 

TTCATGGGAGAAAATAATACTGTTGATGGGTGTCTGGTCAGAGACATCAAGA 

AATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCTGGT 

CATCCAGCGGATAGTTAATGATCAGCCCACTGACCCGTTGCGCGAGAAGATT 

GTGCACCGCCGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCA 

CCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTG 

CGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGA 

CTGTTTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCG 

CCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGG

TTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACAT 

CGTATAACGTTACTGGTTTCACATTCACCACCCTGAATTGACTCTCTTCCGGG 

CGCTATCATGCCATACCGCGAAAGGTTTTGCACCATTCGATGGTGTCG
AAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAAC

ACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTG

GCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTA 

CTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGAC 

CTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGG 

GGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAG 

CCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGG 

GAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAG 

AAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTT 

AATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGT 

GCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGG 

GCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCT 

CAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGG

TAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGG 

TGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTG 

GGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCG 

ACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTC 

GACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGG 

GGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAA 

CCTTACCTGGTC 

TTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGA 

GACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAG 

TCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAA 

CTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAA 

GTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAATGGCGCATAC 

AAAGAGAAGCGACCTCGCGAGAGCAAGCGGACCTCATAAAGTGCGTCGTAG 

TCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATC 

GTGGATCAGAATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCC 

GTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGG 

AGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTA 

ACCGTAGGGGAACCTGCGGTTGGATCATGGGATTACCTTAAAGAAGCGTACT 

TTGTAGTGCTCACACAGATTGTCTGATAGAAAGTGAAAAGCAAGGCGTTTAC

GCGTTGGGAGTGAGGCTGAAGAGAATAAGGCCGTTCGCTTTCTATTAATGAA

AGCTCACCCTACACGAAAATATCACGCAACGCGTGATAAGCAATTTTCGTGT 

CCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGT 

TCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCGCCGAC 

CTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATTTTTGCTCTTTAA 

AAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGAGAGTTGTTCGTG 

AGTCTCTCAAATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGG 

TTGT 

GAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGA 
TGAAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTA 
TAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGACACACTATCAT 
TAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAA 
GTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGA 
ACGGGGAGCAGCCCAGAGCCTGAATCAGTGTGTGTGTTAGTGGAAGCGTCTG

GAAAGGCGCGCGATACAGGGTGACAGCCCCGTACACAAAAATGCACATGCT 

GTGAGCTCGATGAGTAGGGCGGGACACGTGGTATCCTGTCTGAATATGGGGG 

GACCATCCTCCAAGGCTAAATACTCCTGACTGACCGATAGTGAACCAGTACC 

GTGAGGGAAAGGCGAAAAGAACCCCGGCGAGGGGAGTGAAAAAGAACCTG 

AAACCGTGTACGTACAAGCAGTGGGAGCACGCTTAGGCGTGTGACTGCGTAC 

CTTTTGTATAATGGGTCAGCGACTTATATTCTGTAGCAAGGTTAACCGAATA 

GGGGAGCCGAAGGGAAACCGAGTCTTAACTGGGCGTTAAGTTGCAGGGTAT 

AGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAAGGTTGGGTAACA 

CTAACTGGAGGACCGAACCGACTAATGTTGAAAAATTAGCGGATGACTTGTG 

GCTGGGGGTGAAAGGCCAATCAAACCGGGAGATAGCTGGTTCTCCCCGAAA 

GCTATTTAGGTAGCGCCTCGTGAATTCATCTCCGGGGGTAGAGCACTGTTTC 

GGCAAGGGGGTCATCCCGACTTACCAACCCGATGCAAACTGCGAATACCGG 

AGAATGTTATCACGGGAGACACACGGCGGGTGCTAACGTCCGTCGTGAAGA 

GGGAAACAACCCA 

GACCGCCAGCTAAGGTCCCAAAGTCATGGTTAAGTGGGAAACGATGTGGGA 

AGGCCCAGACAGCCAGGATGTTGGCTTAGAAGCAGCCATCATTTAAAGAAA 

GCGTAATAGCTCACTGGTCGAGTCGGCCTGCGCGGAAGATGTAACGGGGCTA 

AACCATGCACCGAAGCTGCGGCAGCGACGCTTATGCGTTGTTGGGTAGGGGA 

GCGTTCTGTAAGCCTGCGAAGGTGTGCTGTGAGGCATGCTGGAGGTATCAGA 

AGTGCGAATGCTGACATAAGTAACGATAAAGCGGGTGAAAAGCCCGCTCGC

CGGAAGACCAAGGGTTCCTGTCCAACGTTAATCGGGGCAGGGTGAGTCGAC

CCCTAAGGCGAGGCCGAAAGGCGTAGTCGATGGGAAACAGGTTAATATTCC 

TGTACTTGGTGTTACTGCGAAGGGGGGACGGAGAAGGCTATGTTGGCCGGGC 

GACGGTTGTCCCGGTTTAAGCGTGTAGGCTGGTTTTCCAGGCAAATCCGGAA 

AATCAAGGCTGAGGCGTGATGACGAGGCACTACGGTGCTGAAGCAACAAAT 

GCCCTGCTTCCAGGAAAAGCCTCTAAGCATCAGGTAACATCAAATCGTACCC 

CAAACCGACACAGGTGGTCAGGTAGAGAATACCAAGGCGCTTGAGAGAACT

CGGGTGAAGGAACTAGGCAAAATGGTGCCGTAACTTCGGGAGAAGGCACGC 

TGATATGTAGGTGAGGTCCCTCGCGGATGGAGCTGAAATCAGTCGAAGATAC 

CAGCTGGCTGCAACTGTTTATTAAAAACACAGCACTGTGCAAACACGAAAGT 

GGACGTATACGGTGTGACGCCTGCCCGGTGCCGGAAGGTTAATTGATGGGGT 

TAGCGCAAGCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACT 

ATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCAC 

GAATGGCGTAA 

TGATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGA 

AGATGCAGTGTACCCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATA 

GCTTGACACTGAACATTGAGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGA 

AGTGTGGACGCCAGTCTGCATGGAGCCGACCTTGAAATACCACCCTTTAATG 

TTTGATGTTCTAACGTTGACCCGTAATCCGGGTTGCGGACAGTGTCTGGTGG 

GTAGTTTGACTGGGGCGGTCTCCTCCTAAAGAGTAACGGAGGAGCACGAAG 

GTTGGCTAATCCTGGTCGGACATCAGGAGGTTAGTGCAATGGCATAAGCCAG 

CTTGACTGCGAGCGTGACGGCGCGAGCAGGTGCGAAAGCAGGTCATAGTGA 

TCCGGTGGTTCTGAATGGAAGGGCCATCGCTCAACGGATAAAAGGTACTCCG 

GGGATAACAGGCTGATACCGCCCAAGAGTTCATATCGACGGCGGTGTTTGGC 

Fig. 24 
ACCTCGATGTCGGCTCATCACATCCTGGGGCTGAAGTAGGTCCCAAGGGTAT

GGCTGTTCGCCATTTAAAGTGGTACGCGAGCTGGGTTTAGAACGTCGTGAGA 

CAGTTCGGTCCCTATCTGCCGTGGGCGCTGGAGAACTGAGGGGGGCTGCTCC 

TAGTACGAGAGGACCGGAGTGGACGCATCACTGGTGTTCGGGTTGTCATGCC 

AATGGCACTGCCCGGTAGCTAAATGCGGAAGAGATAAGTGCTGAAAGCATC 

TAAGCACGAAACTTGCCCCGAGATGAGTTCTCCCTGACCCTTTAAGGGTCCT 

GAAGGAACGTTGAAGACGACGACGTTGATAGGCCGGGTGTGTAAGCGCAGC 

GATGCGTTGAGCTAACCGGTACTAATGAACCGTGAGGCTTAACCTTACAACG 

CCGAAGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAA 

TCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGC 

GCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCG 

CCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGA 

ACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTT 

CGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCC 

GGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGG 

ACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCATCCTGAC

ATCCCCCCACAGATACGGTAAACTAGCCTCGTTTTTGCATCAGGAAAGCAGC

TATGAACCACTCCTTAAAACCCTGGAACACATTTGGCATTGATCATAATGCT 

CAGCACATTGTATGGGCCTTAAGGGCCCAACAATTACTCAATGCCTGGCAGT

ATGCAACCGCAGAAGGACAACCCGTTCTTATTCTGGGTGAAGGAAGTAATGT

ACTTTTTCTGGAGGACTATCGCGGCACGGTGATCATCAACCGGATCAAAGGT 

ATCGAAATTCATGATGAACCTGATGCGTGGTATTTACATGTAGGAGCCGGAG 

AAAACTGGCATCGTCTGGTAAAATACACTTTGCAGGAAGGTATGCCTGGTCT 

GGAAAATCTGGCATTAATTCCTGGTTGTGTCGGCTCATCACCTATCCAGAAT 

ATTGGTGCTTATGGCGTAGAATTACAGCGAGTTTGCGCTTATGTTGATTCTGT 

TGAACTGGCGACAGGCAAGCAAGTGCGCTTAACTGCCAAAGAGTGCCGTTTT 

GGCTATCGCGACAGTATTTTTAAACATGAATACCAGGACCGCTTCGCTATTG 

TAGCCGTAGGTCTGCGTCTGCCAAAAGAGTGGCAACCTGTACTAACGTATGG 

TGACTTAACTCGTCTGGGATCCACAGGACGGGTGTGGTCGCCATGATCGCGT 

AGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGCGGCGGCCAA 

AGC

GGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAACGCAT 

ATAGCGCTAGCAGCACGCCATAGTGACTGGCGATGCTGTCGGAATGGACGAT 

ATCCCGCAAGAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGC 

ATCCAGGGTGACGGTGCCGAGGATGACGATGAGCGCATTGTTAGATTTCATA 

CACGGTGCCTGACTGCGTTAGCAATTTAACTGTGATAAACTACCGCATTAAA 

GCTTATCGATGATAAGCTGTCAAACATGAGAATTCTTGAAGACGAAAGGGCC 

TCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAG 

ACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATT 

TTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAA 

ATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTG 

TCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAG

AAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGG 

GTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCC 
CGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCG

GTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACT 

ATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTAC 

GGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGAT 

AACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTA 

ACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGG 

AA 

CCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCT 

GCAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTC 

TAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAG 

GACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCT 

GGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATG 

GTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTAT 

GGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCA 

TTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAAC 

TTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATG 

ACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAG 

AAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGC 

TTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAG 

AGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACC

AAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCT 

GTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGC 

CAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCG 

GATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGC 

TTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGA 

GAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGC 

GGC 

AGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGG

TATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTT

GTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGC 

CTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGC

GTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATA 

CCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAG 

CGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCA 

CACCGCATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAA 

GCCAGTATACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGCGCCCCGA 

CACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCAT 

CCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTT 

TTCACCGTCATCACCGAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCG 

TGGTCGTGAAGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTT 

GAGTTTCTCCAGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTA

AGGGCGGTTTTTTCCTGTTTGGTCACTTGATGCCTCCGTGTAAGGGGGAATTT 

CTGTTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCTCACGATA 

CGGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTGAGGGTAAAC 

AACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACTCAGGGTCAAT 







GCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGGTAGCCAGCAGCA
TCCTGCGATGCCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGT 
AACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAATTC 
GAGCTCGGTACCTGCACTGACGACAGGAAGAG 

TTTGTAGAAACGCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTG 

ATGCCTGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTGC 

TTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGGAGAGCGT 

TCACCGACAAACAACAGATAAAACGAAAGGCCCAGTCTTTCGACTGAGCCTT 

TCGTTTTATTTGATGCCTGGCAGTTCCCTACTCTCGCATGGGGAGACCCCACA 

CTACCATCGGCGCTACGACTAGATTATTTGTAGAGCTCATCCATGCCATGTGT 

AATCCCAGCAGCAGTTACAAACTCAAGAAGGACCATGTGGTCACGCTTTTCG 

TTGGGATCTTTCGAAAGGGCAGATTGTGTCGACAGGTAATGGTTGTCTGGTA 

AAAGGACAGGGCCATCGCCAATTGGAGTATTTTGTTGATAATGGTCTGCTAG 

TTGAACGGATCCATCTTCAATGTTGTGGCGAATTTTGAAGTTAGCTTTGATTC 

CATTCTTTTGTTTGTCTGCCGTGATGTATACATTGTGTGAGTTATAGTTGTACT 

CGAGTTTGTGTCCGAGAATGTTTCCATCTTCTTTAAAATCAATACCTTTTAAC 

TCGATACGATTAACAAGGGTATCACCTTCAAACTTGACTTCAGCACGCGTCT 

TGTAGTTCCCGTCATCTTTGAAAGATATAGTGCGTTCCTGTACATAACCTTCG 

GGCATGGCACTCTTGAAAAAGTCATGCCGTTTCATATGATCCGGATAACGGG 

AAAAGCATTGAACACCATAAGAGAAAGTAGTGACAAGTGTTGGCCATGGAA 

CAGGTAGTTTTCCAGTAGTGCAAATAAATTTAAGGGTAAGCTTTCCGTATGT 

AGCATCACCTTCACCCTCTCCACTGACAGAAAATTTGTGCCCATTAACATCAC

CATCTAATTCAACAAGAATTGGGACAACTCCAGTGAAAAGTTCTTCT 

CCTTTGCTCGCAGTGATTTTTTTCTCCATTTGCGGAGGGATATGAAAGCGGCC

GCTTCCACACATTAAACTAGTTCGATGATTAATTGTCAACAGCTCGCCGGCG 

GCACCTCGCTAACGGATTCACCACTCCAAGAATTGGAGCCAATCGATTCTTG 

CGGAGAACTGTGAATGCGGGTACCCAGATCCGGAACATAATGGTGCAGGGC 

GCTGACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAGACCATTC 

ATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCACGTTCGC 

TCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCCCGCCAGCCTA 

GCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTGGCCAGGACC 

CAACGCTGCCCGAGATGCGCCGCGTGCGGCTGCTGGAGATGGCGGACGCGA

TGGATATGTTCTGCCAAGGGTTGGTTTGCGCATTCACAGTTCTCCGCAAGAAT 

CGATTGGCTCCAATTCTTGGAGTGGTGAATCCGTTAGCGAGGTGCCGCCGGC 

GAGCTGTTGACAATTAATCATCGAACTAGTTTAATGTGTGGAAGCGGCCGCT 

TTCATATCCCTCCGCAAATGGAGAAAAAAATCACTGGATATACCACCGTTGA 

TATATCCCAATGGCATCGTAAAGAACATTTTGAGGCATTTCAGTCAGTTGCTC 

AATGTACCTATAACCAGACCGTTCAGCTGGATATTACGGCCTTTTTAAAGAC

CGTAAAGAAAAATAAGCACAAGTTTTATCCGGCCTTTATTCACATTCTTGCCC

GCCTGATGAATGCTCATCCGGAATTCCGTATGGCAATGAAAGACGGTGAGCT 

GGTGATATGGGATAGTGTTCACCCTTGTTACACCGTTTTCCATGAGCAAACTG

AAACGTTTTCATCGCTCTGGAGTGAATACCACGACGATTTCCGGCAGTTTC 

TACACATATATTCGCAAGATGTGGCGTGTTACGGTGAAAACCTGGCCTATTT 

CCCTAAAGGGTTTATTGAGAATATGTTTTTCGTCTCAGCCAATCCCTGGGTGA 

GTTTCACCAGTTTTGATTTAAACGTGGCCAATATGGACAACTTCTTCGCCCCC
GTTTTCACCATGGGCAAATATTATACGCAAGGCGACAAGGTGCTGATGCCGC

TGGCGATTCAGGTTCATCATGCCGTCTGTGATGGCTTCCATGTCGGCAGAAT 

GCTTAATGAATTACAACAGTACTGCGATGAGTGGCAGGGCGGGGCGTAATTT 

TTTTAAGGCAGTTATTGGTGCCCTTAAACGCCTGGTGCTACGCCTGAATAAGT 

GATAATAAGCGGATGAATGGCAGAAATTCGAAAGCAAATTCGACCCGGTCG 

TCGGTTCAGGGCAGGGTCGTTAAATAGCCGCTTATGTCTATTGCTGGTTTACG 

GTTTATTGACTACCCGAAGCAGTGTGACCCTGTGCTTCTCAAATGCCTGAGG

GCAGTTTGCTCAGGTCTCCCGTGGGGGGGAATAATTAACGGTATGAGCCTTA

CGGCGGACGGATCGTGGCCGCAAGTGGGTCCGGCTAGAGGATCCGACACCA 

TCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGA 

GTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGTTATACGATGTCGCAGA 

GTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGGCCAGC 

CACGTTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTG 

AATTACATTCCCAACCGCGTGGCACAACAACTGGCGGGCAAACAGTCGTTGC 

TGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAAATTGTC 

GCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTCGA 



ggtagaacgaagcggcgtcgaagcctgtaaagcggcggtgcacaatcttct 
cgcgcaacgggtcagtgggctgatcattaactatccgctggatgaccaggat 
gccattgctgtggaagctgcctgcactaatgttccggcgttatttcttgatgt 
ctctgaccagacacccatcaacagtattattttctcccatgaagacggtacg
cgactgggcgtggagcatctggtcgcattgggtcaccagcaaatcgcgctgt 
tagcgggcccattaagttgtgtctcggcgcgtctgcgtctggctggctggcat 
aaatatctcactcgcaatcaaattcagccgatagcggaacgggaaggcgac 
tggagtgccatgtccggttttcaacaaaccatgcaaatgctgaatgagggca 
tcgttcccactgcgatgctggttgccaacgatcagatggcgctgggcgcaat 
gcgcgccattaccgagtccgggctgcgcgttggtgcggatatctcggtagtg 
ggatacgacgataccgaagacagctcatgttatatcccgccgtcaaccacca 
tcaaacaggattttcgcctgctggggcaaaccagcgcggaccgcttgctgca 
actctctcagggccaggcggtgaagggcaatcagctgttgcccgtctcactg 
gtgaaaagaaaaaccaccctggcgcccaatacgcaaaccgcctctccccgc 
gcgttggccgattcattaatgcagctggcacgacaggtttcccgactggaaa 
gcgggcagtgagcgcaacgcaattaatgtgagttagctcactcattaggcac
cccaggctttacactttatgcttccggctcgtataatgtgtggaattgtgagc 
ggataacaatttcacacagcggccgctgagaaaaagcgaagcggcactgct 
ctttaacaatttatcagacaatctgtgtgggcactcgaagatacggattctt 
aacgtcgcaagacgaaaaatgaataccaagtctcaagagtgaacacgtaat 
tcattacgaagtttaattctttgagcgtcaaacrntaacgacggccagtga 
attcgagctcggtacctgcactgacgacaggaagag 
AAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAAC

ACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTG 

GCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTA 

CTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGAC 

CTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGG 

GGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAG 

CCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGG 

GAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAG 

AAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTT 

AATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGT 

GCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGG 

GCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCT 

CAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGG 

TAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGG 

TGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTG 

GGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCG 

ACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTC 

GACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGG 

GGCCCGCACAAGCGGCGGAGCATGTGGATTAATTCGATGCAACGCGAAGAA 

CCTTACCTGGGTTTGACATGCACAGGACGCGTCTAGAGATAGGCGTTCCCTT

GTGGCCTGTGTGCAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATG 

TTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCTCATGTTGCCAGCACGT 

AATGGTGGGGACTCGTGAGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGG 

GATGACGTCAAGTCATCATGCCCCTTATGTCCAGGGCTTCACACATGCTACA 

ATGGCCGGTACAAAGGGCTGCGATGCCGCGAGGTTAAGCGAATCCTTAAAA 

GCCGGTCTCAGTTCGGATCGGGGTCTGCAACTCGACCCCGTGAAGTCGGAGT

CGCTAGTAATCGCAGATCAGCAACGCTGCGGTGAATACGTTCCCGGGCCTTG 

TACACACCGCCCGTCACGTCATGAAAGTCGGTAACACCCGAAGCCAGTGGCC 

TAACCCTCGGGAGGGAGCTGTCGAAGGTGGGATCGGCGATTGGGACGAAGT 

CGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCATGGGATTACCTTA 

AAGAAGCGTACTTTGTAGTGCTCACACAGATTGTCTGATAGAAAGTGAAAAG 

CAAGGCGTTTACGCGTTGGGAGTGAGGCTGAAGAGAATAAGGCCGTTCGCTT

TCTATTAATGAAAGCTCACCCTACACGAAAATATCACGCAACGCGTGATAAG 

CAATTTTCGTGTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCG

GTAACAGGGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGA 

AAGTCGCCGACCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATTT 

TTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGAG 

AGTTGTTCGTGAGTCTCTCAAATTTTCGCAACACGATGATGAATCGAAAGAA 

ACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTG 

GCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGG 

TGATATGAACCGTTATAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTT 

TCGACACACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGA 

ACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCA 

GTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGTGTGTGTGT 

TAGTGGAAGCGTCTGGAAAGGCGCGCGATACAGGGTGACAGCCCCGTACAC 
AAAAATGCACATGCTGTGAGCTCGATGAGTAGGGCGGGACACGTGGTATCCT

GTCTGAATATGGGGGGACCATCCTCCAAGGCTAAATACTCCTGACTGACCGA 

TAGTGAACCAGTACCGTGAGGGAAAGGCGAAAAGAACCCCGGCGAGGGGA 

GTGAAAAAGAACCTGAAACCGTGTACGTACAAGCAGTGGGAGCACGCTTAG 

GCGTGTGACTGCGTACCTTTTGTATAATGGGTCAGCGACTTATATTCTGTAGC

AAGGTTAACCGAATAGGGGAGCCGAAGGGAAACCGAGTCTTAACTGGGCGT 

TAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTG 

AAGGTTGGGTAACACTAACTGGAGGACCGAACCGACTAATGTTGAAAAATT 

AGCGGATGACTTGTGGCTGGGGGTGAAAGGCCAATCAAACCGGGAGATAGC 

TGGTTCTCCCCGAAAGCTATTTAGGTAGCGCCTCGTGAATTCATCTCCGGGG

GTAGAGCACTGTTTCGGCAAGGGGGTCATCCCGACTTACCAACCCGATGCAA

ACTGCGAATACCGGAGAATGTTATCACGGGAGACACACGGCGGGTGCTAAC 

GTCCGTCGTGAAGAGGGAAACAACCCAGACCGCCAGCTAAGGTCCCAAAGT 

CATGGTTAAGTGGGAAACGATGTGGGAAGGCCCAGACAGCCAGGATGTTGG 

CTTAGAAGCAGCCATCATTTAAAGAAAGCGTAATAGCTCACTGGTCGAGTCG

GCCTGCGCGGAAGATGTAACGGGGCTAAACCATGCACCGAAGCTGCGGCAG

CGACGCTTATGCGTTGTTGGGTAGGGGAGCGTTCTGTAAGCCTGCGAAGGTG

TGCTGTGAGGCATGCTGGAGGTATCAGAAGTGCGAATGCTGACATAAGTAAC 

GATAAAGCGGGTGAAAAGCCCGCTCGCCGGAAGACCAAGGGTTCCTGTCCA 

acgttaatcggggcagggtgagtcgacccctaaggcgaggccgaaaggcgt 

AGTCGATGGGAAACAGGTTAATATTCCTGTACTTGGTGTTACTGCGAAGGGG 

ggacggagaaggctatgttggccgggcgacggttgtcccggtttaagcgtgt 

AGGCTGGTTTTCCAGGCAAATCCGGAAAATCAAGGCTGAGGCGTGATGACG 

aggcactacggtgctgaagcaacaaatgccctgcttccaggaaaagcctcta 

AGCATCAGGTAACATCAAATCGTACCCCAAACCGACACAGGTGGTCAGGTA 

gagaataccaaggcgcttgagagaactcgggtgaaggaactaggcaaaatg 
gtgccgtaacttcgggagaaggcacgctgatatgtaggtgaggtccctcgcg 
gatggagctgaaatcagtcgaagataccagctggctgcaactgtttattaaa 

AACACAGCACTGTGCAAACACGAAAGTGGACGTATACGGTGTGACGCCTGC 

CCGGTGCCGGAAGGTTAATTGATGGGGTTAGCGCAAGCGAAGCTCTTGATCG

AAGCCCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAAT

TCCTTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCT

GTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTAC 

CCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACTGAA 

CATTGAGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGACGCCA 

GTCTGCATGGAGCCGACCTTGAAATACCACCCTTTAATGTTTGATGTTCTAAC 

GTTGACCCGTAATCCGGGTTGCGGACAGTGTCTGGTGGGTAGTTTGACTGGG 

GCGGTCTCCTCCTAAAGAGTAACGGAGGAGCACGAAGGTTGGCTAATCCTGG 

TCGGACATCAGGAGGTTAGTGCAATGGCATAAGCCAGCTTGACTGCGAGCGT 

GACGGCGCGAGCAGGTGCGAAAGCAGGTCATAGTGATCCGGTGGTTCTGAA 

TGGAAGGGCCATCGCTCAACGGATAAAAGGTACTCCGGGGATAACAGGCTG

ATACCGCCCAAGAGTTCATATCGACGGCGGTGTTTGGCACCTCGATGTCGGC 

TCATCACATCCTGGGGCTGAAGTAGGTCCCAAGGGTATGGCTGTTCGCCATT 

TAAAGTGGTACGCGAGCTGGGTTTAGAACGTCGTGAGACAGTTCGGTCCCTA 

TCTGCCGTGGGCGCTGGAGAACTGAGGGGGGCTGCTCCTAGTACGAGAGGA 

Fig. 25 
CCGGAGTGGACGCATCACTGGTGTTCGGGTTGTCATGCCAATGGCACTGCCC

GGTAGCTAAATGCGGAAGAGATAAGTGCTGAAAGCATCTAAGCACGAAACT 

TGCCCCGAGATGAGTTCTCCCTGACCCTTTAAGGGTCCTGAAGGAACGTTGA 

AGACGACGACGTTGATAGGCCGGGTGTGTAAGCGCAGCGATGCGTTGAGCT

AACCGGTACTAATGAACCGTGAGGCTTAACCTTACAACGCCGAAGCTGTTTT 

GGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCAGAACGCAGA 

AGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCAC 

CTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGT 

GGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAA 

AGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAAC 

GCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGC 

AACGGCCCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATC 

AAATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAAC 

TCTTCCTGTCGTCATATCTACAAGCCATCCCCCCACAGATACGGTAAACTAGC 

CTCGTTTTTGCATCAGGAAAGCAGCTATGAACCACTCCTTAAAACCCTGGAA 

CACATTTGGCATTGATCATAATGCTCAGCACATTGTATGGGCCTTAAGGGCC 

CAACAATTACTCAATGCCTGGCAGTATGCAACCGCAGAAGGACAACCCGTTC 

TTATTCTGGGTGAAGGAAGTAATGTACTTTTTCTGGAGGACTATCGCGGCAC 

GGTGATCATCAACCGGATCAAAGGTATCGAAATTCATGATGAACCTGATGCG 

TGGTATTTACATGTAGGAGCCGGAGAAAACTGGCATCGTCTGGTAAAATACA 

CTTTGCAGGAAGGTATGCCTGGTCTGGAAAATCTGGCATTAATTCCTGGTTGT 

GTCGGCTCATCACCTATCCAGAATATTGGTGCTTATGGCGTAGAATTACAGC 

GAGTTTGCGCTTATGTTGATTCTGTTGAACTGGCGACAGGCAAGCAAGTGCG

CTTAACTGCCAAAGAGTGCCGTTTTGGCTATCGCGACAGTATTTTTAAACATG

AATACCAGGACCGCTTCGCTATTGTAGCCGTAGGTCTGCGTCTGCCAAAAGA 

GTGGCAACCTGTACTAACGTATGGTGACTTAACTCGTCTGGGATCCACAGGA 

CGGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGC 

GAGCAGGACTGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGG 

TGCGCATAGAAATTGCATCAACGCATATAGCGCTAGCAGCACGCCATAGTGA 

CTGGCGATGCTGTCGGAATGGACGATATCCCGCAAGAGGCCCGGCAGTACC 

GGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGGTGCCGAGGATG 

ACGATGAGCGCATTGTTAGATTTCATACACGGTGCCTGACTGCGTTAGCAAT 

TTAACTGTGATAAACTACCGCATTAAAGCTTATCGATGATAAGCTGTCAAAC 

ATGAGAATTCTTGAAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGT

TAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGA 

AATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTA 

TCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGG 

AAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGC

ATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATG

CTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAG 

CGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGC 

ACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCA 

AGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTAC 

TCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTAT 

GCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGAC 
AACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGA

TCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCA 

AACGACGAGCGTGACACCACGATGCCTGCAGCAATGGCAACAACGTTGCGC 

AAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAG 

ACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCC 

GGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGC 

GGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTA 

TCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCG 

CTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTA 

CTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCT 

AGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTT 

TCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAG 

ATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTA 

CCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGT 

AACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCG 

TAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCT 

GCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACC 

GGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGA 

ACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAA

CTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGG 

AGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCG 

CACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGG 

TTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCG 

GAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTT

TGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGAT 

AACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGA

CCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGT 

ATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTC

AGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATC 

GCTACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGAC 

GCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGA 

CCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACG 

CGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCGTGAAGCGATTCACA 

GATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTTTCTCCAGAAGCGTTA

ATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTG 

GTCACTTGATGCCTCCGTGTAAGGGGGAATTTCTGTTCATGGGGGTAATGAT 

ACCGATGAAACGAGAGAGGATGCTCACGATACGGGTTACTGATGATGAACA 

TGCCCGGTTACTGGAACGTTGTGAGGGTAAACAACTGGCGGTATGGATGCGG 

CGGGACCAGAGAAAAATCACTCAGGGTCAATGCCAGCGCTTCGTTAATACA 

GATGTAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGCCTGGCGAAA 

GGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGT 

CACGACGTTGTAAAACGACGGCCAGTGAATTCGAGCTCGGTACCTGCACTGA 

CGACAGGAAGAGTTTGTAGAAACGCAAAAAGGCCATCCGTCAGGATGGCCT 

TCTGCTTAATTTGATGCCTGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCT 

CCGGGCCGTTGCTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACT 
CAGGAGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTCTTT

CGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACTCTCGCATGG 

GGAGACCCCACACTACCATCGGCGCTACGACTAGATTATTTGTAGAGCTCAT 

CCATGCCATGTGTAATCCCAGCAGCAGTTACAAACTCAAGAAGGACCATGTG 

GTCACGCTTTTCGTTGGGATCTTTCGAAAGGGCAGATTGTGTCGACAGGTAA 

TGGTTGTCTGGTAAAAGGACAGGGCCATCGCCAATTGGAGTATTTTGTTGAT 

AATGGTCTGCTAGTTGAACGGATCCATCTTCAATGTTGTGGCGAATTTTGAA 

GTTAGCTTTGATTCCATTCTTTTGTTTGTCTGCCGTGATGTATACATTGTGTGA

GTTATAGTTGTACTCGAGTTTGTGTCCGAGAATGTTTCCATCTTCTTTAAAAT 

CAATACCTTTTAACTCGATACGATTAACAAGGGTATCACCTTCAAACTTGACT 

TCAGCACGCGTCTTGTAGTTCCCGTCATCTTTGAAAGATATAGTGCGTTCCTG 

TACATAACCTTCGGGCATGGCACTCTTGAAAAAGTCATGCCGTTTCATATGA 

TCCGGATAACGGGAAAAGCATTGAACACCATAAGAGAAAGTAGTGACAAGT 

GTTGGCCATGGAACAGGTAGTTTTCCAGTAGTGCAAATAAATTTAAGGGTAA

GCTTTCCGTATGTAGCATCACCTTCACCCTCTCCACTGACAGAAAATTTGTGC 

CCATTAACATCACCATCTAATTCAACAAGAATTGGGACAACTCCAGTGAAAA 

GTTCTTCTCCTTTGCTCGCAGTGATTTTTTTCTCCATTTGCGGAGGGATATGA 

AAGCGGCCGCTTCCACACATTAAACTAGTTCGATGATTAATTGTCAACAGCT 

CGCCGGCGGCACCTCGCTAACGGATTCACCACTCCAAGAATTGGAGCCAATC 

GATTCTTGCGGAGAACTGTGAATGCGGGTACCCAGATCCGGAACATAATGGT 

GCAGGGCGCTGACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAG

ACCATTCATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCA 

CGTTCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCCCGC 

CAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTGGCC

AGGACCCAACGCTGCCCGAGATGCGCCGCGTGCGGCTGCTGGAGATGGCGG 

ACGCGATGGATATGTTCTGCCAAGGGTTGGTTTGCGCATTCACAGTTCTCCGC 

AAGAATCGATTGGCTCCAATTCTTGGAGTGGTGAATCCGTTAGCGAGGTGCC 

GCCGGCGAGCTGTTGACAATTAATCATCGAACTAGTTTAATGTGTGGAAGCG 

GCCGCTTTCATATCCCTCCGCAAATGGAGAAAAAAATCACTGGATATACCAC 

CGTTGATATATCCCAATGGCATCGTAAAGAACATTTTGAGGCATTTCAGTCA

GTTGCTCAATGTACCTATAACCAGACCGTTCAGCTGGATATTACGGCCTTTTT

AAAGACCGTAAAGAAAAATAAGCACAAGTTTTATCCGGCCTTTATTCACATT

CTTGCCCGCCTGATGAATGCTCATCCGGAATTCCGTATGGCAATGAAAGACG

GTGAGCTGGTGATATGGGATAGTGTTCACCCTTGTTACACCGTTTTCCATGAG 

CAAACTGAAACGTTTTCATCGCTCTGGAGTGAATACCACGACGATTTCCGGC

AGTTTCTACACATATATTCGCAAGATGTGGCGTGTTACGGTGAAAACCTGGC 

CTATTTCCCTAAAGGGTTTATTGAGAATATGTTTTTCGTCTCAGCCAATCCCT 

GGGTGAGTTTCACCAGTTTTGATTTAAACGTGGCCAATATGGACAACTTCTTC

GCCCCCGTTTTCACCATGGGCAAATATTATACGCAAGGCGACAAGGTGCTGA 

TGCCGCTGGCGATTCAGGTTCATCATGCCGTCTGTGATGGCTTCCATGTCGGC 

AGAATGCTTAATGAATTACAACAGTACTGCGATGAGTGGCAGGGCGGGGCG 

TAATTTTTTTAAGGCAGTTATTGGTGCCCTTAAACGCCTGGTGCTACGCCTGA 

ATAAGTGATAATAAGCGGATGAATGGCAGAAATTCGAAAGCAAATTCGACC 

CGGTCGTCGGTTCAGGGCAGGGTCGTTAAATAGCCGCTTATGTCTATTGCTG 

GTTTACGGTTTATTGACTACCCGAAGCAGTGTGACCCTGTGCTTCTCAAATGC 
CTGAGGGCAGTTTGCTCAGGTCTCCCGTGGGGGGGAATAATTAACGGTATGA

GCCTTACGGCGGACGGATCGTGGCCGCAAGTGGGTCCGGCTAGAGGATCCG 

ACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGA 

AGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGTTATACGATGTC 

GCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGG 

CCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGG 

AGCTGAATTACATTCCCAACCGCGTGGCACAACAACTGGCGGGCAAACAGTC 

GTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAA 

ATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGG 

TGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACA 

ATCTTCTCGCGCAACGGGTCAGTGGGCTGATCATTAACTATCCGCTGGATGA 

CCAGGATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTC 

TTGATGTCTCTGACCAGACACCCATCAACAGTATTATTTTCTCCCATGAAGAC 

GGTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCG 

CGCTGTTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGC 

TGGCATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG 

GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAATGA 

GGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCGCTGGGC 

GCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGGATATCTCGG 

TAGTGGGATACGACGATACCGAAGACAGCTCATGTTATATCCCGCCGTCAAC 

CACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTTG

CTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCCGTCT 

CACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTCC 

CCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTG 

GAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAG 

GCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATAATGTGTGGAATTGT 

GAGCGGATAACAATTTCACACAGCGGCCGCTGAGAAAAAGCGAAGCGGCAC 

TGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCGAAGATACGGAT 

TCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCAAGAGTGAACACG 

TAATTCATTACGAAGTTTAATTCTTTGAGCGTCAAACTTTT 
AAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAAC

ACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTG 

GCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTA 

CTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGAC 

CTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGG 

GGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAG 

CCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGG 

GAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAG 

AAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTT

AATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGT 

GCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGG 

GCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCT

CAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGG 

TAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGG 

TGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTG 

GGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCG 

ACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTC 

GACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGG 

GGCCCGCACAAGCGGCGGAGCATGTGGATTAATTCGATGCAACGCGAAGAA 

CCTTACCTGGGTTTGACATGCACAGGACGCGTCTAGAGATAGGCGTTCCCTT 

GTGGCCTGTGTGCAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATG 

TTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCTCATGTTGCCAGCACGT 

AATGGTGGGGACTCGTGAGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGG 

GATGACGTCAAGTCATCATGCCCCTTATGTCCAGGGCTTCACACATGCTACA 

ATGGCCGGTACAAAGGGCTGCGATGCCGCGAGGTTAAGCGAATCCTTAAAA 

GCCGGTCTCAGTTCGGATCGGGGTCTGCAACTCGACCCCGTGAAGTCGGAGT 

CGCTAGTAATCGCAGATCAGCAACGCTGCGGTGAATACGTTCCCGGGCCTTG 

TACACACCGCCCGTCACGTCATGAAAGTCGGTAACACCCGAAGCCAGTGGCC 

TAACCCTCGGGAGGGAGCTGTCGAAGGTGGGATCGGCGATTGGGACGAAGT 

CGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCATGGGATTACCTTA 

AAGAAGCGTACTTTGTAGTGCTCACACAGATTGTCTGATAGAAAGTGAAAAG 

CAAGGCGTTTACGCGTTGGGAGTGAGGCTGAAGAGAATAAGGCCGTTCGCTT 

TCTATTAATGAAAGCTCACCCTACACGAAAATATCACGCAACGCGTGATAAG 

CAATTTTCGTGTCCCCTTCGTCTAGACGTAGCGCCGATGGTAGTGTGGGGTCT 

CCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCA 

GTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCC

TGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACGGC 

CCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTA 

AGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTCCT

GTCGTCACTGCAGGCATGCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT

GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCA 

TAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGC 

GTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATT 

AATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTC 

CGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCG 
GTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATA

ACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGT 

AAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGC

ATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTAT 

AAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCG 

ACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC 

GCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCT 

CCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTT

ATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCA 

CTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGT 

GCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAG

TATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGG 

TAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTT 

GCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGA 

TCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGAT

TTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAA

AATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAG

TTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTT 

CATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGG 

CTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCG 

GCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGA 

AGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGA 

AGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATT 

GCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC 

CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAA 

GCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAG 

TGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCA

TCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGA 

ATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAAT 

ACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTT 

CGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTA 

ACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTT

CTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGG

GCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAG 

CATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGA 

AAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGA 

CGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATC 

ACGAGGCCCTTTCGTCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGAC 

ACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAG 

CAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTG 

GCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGG 

TGTGAAATAGCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCAT 

TCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTT 

CGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTG 

GGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAA 
TTCGAGCTCGGTACCTGCAGTGACGACAGGAAGAGTTTGTAGAAACGCAAA

AAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCCTGGCAGTTTATG

GCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTGCTTCGCAACGTTCAAATC 

CGCTCCCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACAAACAACAG

ATAAAACGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCT

GGCAGTTCCCTACTCTCGCATGGGGAGACCCCACACTACCATCGGCGCTACG 

TCTAGATTATTTGTAGAGCTCATCCATGCCATGTGTAATCCCAGCAGCAGTTA 

CAAACTCAAGAAGGACCATGTGGTCACGCTTTTCGTTGGGATCTTTCGAAAG 

GGCAGATTGTGTCGACAGGTAATGGTTGTCTGGTAAAAGGACAGGGCCATCG 

CCAATTGGAGTATTTTGTTGATAATGGTCTGCTAGTTGAACGGATCCATCTTC 

AATGTTGTGGCGAATTTTGAAGTTAGCTTTGATTCCATTCTTTTGTTTGTCTGC

CGTGATGTATACATTGTGTGAGTTATAGTTGTACTCGAGTTTGTGTCCGAGAA 

TGTTTCCATCTTCTTTAAAATCAATACCTTTTAACTCGATACGATTAACAAGG 

GTATCACCTTCAAACTTGACTTCAGCACGCGTCTTGTAGTTCCCGTCATCTTT 

GAAAGATATAGTGCGTTCCTGTACATAACCTTCGGGCATGGCACTCTTGAAA 

AAGTCATGCCGTTTCATATGATCCGGATAACGGGAAAAGCATTGAACACCAT 

AAGAGAAAGTAGTGACAAGTGTTGGCCATGGAACAGGTAGTTTTCCAGTAGT 

GCAAATAAATTTAAGGGTAAGCTTTCCGTATGTAGCATCACCTTCACCCTCTC 

CACTGACAGAAAATTTGTGCCCATTAACATCACCATCTAATTCAACAAGAAT 

TGGGACAACTCCAGTGAAAAGTTCTTCTCCTTTGCTAGCAGTGATTTTTTTCT

CCATTTGCGGAGGGATATGAAAGCGGCCGCTTCCACACATTAAACTAGTTCG 

ATGATTAATTGTCAACAGCTCGCCGGCGGCACCTCGCTAACGGATTCACCAC 

TCCAAGAATTGGAGCCAATCGATTCTTGCGGAGAACTGTGAATGCGGGTACC 

CAGATCCGGAACATAATGGTGCAGGGCGCTGACTTCCGCGTTTCCAGACTTT 

ACGAAACACGGAAACCGAAGACCATTCATGTTGTTGCTCAGGTCGCAGACGT 

TTTGCAGCAGCAGTCGCTTCACGTTCGCTCGCGTATCGGTGATTCATTCTGCT 

AACCAGTAAGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAGCA 

CGATCATGCGCACCCGTGGCCAGGACCCAACGCTGCCCGAGATGCGCCGCGT 

GCGGCTGCTGGAGATGGCGGACGCGATGGATATGTTCTGCCAAGGGTTGGTT 

TGCGCATTCACAGTTCTCCGCAAGAATCGATTGGCTCCAATTCTTGGAGTGGT 

GAATCCGTTAGCGAGGTGCCGCCGGCGAGCTGTTGACAATTAATCATCGAAC 

TAGTTTAATGTGTGGAAGCGGCCGCTTTCATATCCCTCCGCAAATGGAGAAA 

AAAATCACTGGATATACCACCGTTGATATATCCCAATGGCATCGTAAAGAAC 

ATTTTGAGGCATTTCAGTCAGTTGCTCAATGTACCTATAACCAGACCGTTCAG 

CTGGATATTACGGCCTTTTTAAAGACCGTAAAGAAAAATAAGCACAAGTTTT 

ATCCGGCCTTTATTCACATTCTTGCCCGCCTGATGAATGCTCATCCGGAATTC 

CGTATGGCAATGAAAGACGGTGAGCTGGTGATATGGGATAGTGTTCACCCTT 

GTTACACCGTTTTCCATGAGCAAACTGAAACGTTTTCATCGCTCTGGAGTGA 

ATACCACGACGATTTCCGGCAGTTTCTACACATATATTCGCAAGATGTGGCG 

TGTTACGGTGAAAACCTGGCCTATTTCCCTAAAGGGTTTATTGAGAATATGTT

TTTCGTCTCAGCCAATCCCTGGGTGAGTTTCACCAGTTTTGATTTAAACGTGG

CCAATATGGACAACTTCTTCGCCCCCGTTTTCACCATGGGCAAATATTATACG 

CAAGGCGACAAGGTGCTGATGCCGCTGGCGATTCAGGTTCATCATGCCGTCT 

GTGATGGCTTCCATGTCGGCAGAATGCTTAATGAATTACAACAGTACTGCGA 

TGAGTGGCAGGGCGGGGCGTAATTTTTTTAAGGCAGTTATTGGTGCCCTTAA
ACGCCTGGTGCTACGCCTGAATAAGTGATAATAAGCGGATGAATGGCAGAA

ATTCGAAAGCAAATTCGACCCGGTCGTCGGTTCAGGGCAGGGTCGTTAAATA 

GCCGCTTATGTCTATTGCTGGTTTACGGTTTATTGACTACCCGAAGCAGTGTC 

ACCCTGTGCTTCTCAAATGCCTGAGGGCAGTTTGCTCAGGTCTCCCGTGGGG 

GGGAATAATTAACGGTATGAGCCTTACGGCGGACGGATCGTGGCCGCAAGT 

GGGTCCGGCTAGAGGATCCGACACCATCGAATGGTGCAAAACCTTTCGCGGT 

ATGGCATGATAGCGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAA 

CCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCG 

TTTCCCGCGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAA 

AGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACA 

ACAACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTG 

GCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCCGATC 

AACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAG 

CCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGGGTCAGTGGGCTGAT 

TATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGTGGAAGCTGCCTGC 

ACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCAGACACCCATCAACAG 

TATTATTTTCTCCCATGAAGAGGGTACGCGACTGGGCGTGGAGCATCTGGTC 

GCATTGGGTCACCAGCAAATCGCGCTGTTAGCGGGCCCATTAAGTTCTGTCTC

GGCGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGCAATCAAATT 

CAGCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAA 

CAAACCATGCAAATGCTGAATGAGGGCATCGTTCCCACTGCGATGCTGGTTG 

CCAACGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCT 

GCGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACGATACCGAAGACAG 

CTCATGTTATATCCCGCCGTCAACCACCATCAAACAGGATTTTCGCCTGCTGG 

GGCAAACCAGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAA 

GGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCG 

CCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGC 

TGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATT 

AATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCC 

GGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATTTCACACAGCGGCC 

GCTGAGAAAAAGCGAAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTG 

TGTGGGCACTCGAAGATACGGATTCTTAACGTCGCAAGACGAAAAATGAAT 

ACCAAGTCTCAAGAGTGAACACGTAATTCATTACGAAGTTTAATTCTTTGAG 

CGTCAAACTTTT 